October 26, 2023

Writing device drivers in C

The core element of any operating system is its collection of device drivers. These drivers ensure that hardware components such as the mouse, the graphics card, Ethernet cards and the keyboard are available to the end user. In practice, the standard programming language for implementing a device driver is C. C is compiled into assembly instructions, which keeps the resulting code efficient.

Suppose the idea is to build an operating system in Forth with the aim of running it on a stack machine. In such a case the C language isn't available, and the user is forced to rewrite the device drivers in Forth. This produces a situation in which none of the required code exists yet and all of it has to be rewritten. It will take many person-years to rewrite the C device drivers in a Forth dialect. Even if the programmers are highly motivated, they won't be able to fulfill the task within the next 30 years.

One possible attempt to overcome the Forth bottleneck is a virtual machine combined with a high-level language like BASIC. The BASIC program is converted into bytecode, which is executed on a virtual machine. The virtual machine runs on top of a Forth chip, so the programmer doesn't need to program in Forth anymore. The only problem is that BASIC is a high-level programming language, while device drivers are written in a low-level language.
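
As a rough illustration of the bytecode idea, here is a minimal sketch of a stack-based interpreter in C. The opcode names (OP_PUSH, OP_ADD, OP_MUL, OP_HALT) and the function run_bytecode are hypothetical and invented for this example; they do not correspond to any real BASIC implementation or Forth chip.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical opcodes, invented for illustration only. */
enum { OP_PUSH, OP_ADD, OP_MUL, OP_HALT };

#define STACK_SIZE 64

/* Runs the bytecode and returns the value on top of the stack
 * when OP_HALT is reached (or when the program ends). */
int run_bytecode(const int *code, size_t len)
{
    int stack[STACK_SIZE];
    size_t sp = 0; /* number of values currently on the stack */
    size_t pc = 0; /* index of the next instruction */

    while (pc < len) {
        switch (code[pc++]) {
        case OP_PUSH: /* the operand follows the opcode */
            assert(pc < len && sp < STACK_SIZE);
            stack[sp++] = code[pc++];
            break;
        case OP_ADD: /* replace the two top values with their sum */
            assert(sp >= 2);
            sp--;
            stack[sp - 1] += stack[sp];
            break;
        case OP_MUL: /* replace the two top values with their product */
            assert(sp >= 2);
            sp--;
            stack[sp - 1] *= stack[sp];
            break;
        case OP_HALT:
            assert(sp >= 1);
            return stack[sp - 1];
        default:
            assert(0 && "unknown opcode");
            return 0;
        }
    }
    assert(sp >= 1);
    return stack[sp - 1];
}
```

A BASIC expression such as (2 + 3) * 4 could be translated into the sequence OP_PUSH 2, OP_PUSH 3, OP_ADD, OP_PUSH 4, OP_MUL, OP_HALT, for which run_bytecode returns 20. Nothing in this loop touches hardware registers, which is exactly the gap between such a virtual machine and a device driver.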

Unfortunately, it is not possible to run C device drivers in a virtual machine, because the C code needs direct hardware access. The only way to execute C code is to compile it into assembly language. But compiling C code into assembly is only practical for register machines, not for stack machines. There is no C-to-Forth converter available, and even if it were possible to implement such a thing, it couldn't be applied to device drivers.

Device drivers are an important element of any operating system. They make sure that hardware such as printers, USB ports, webcams and so on works. Programming an operating system while ignoring the devices doesn't make sense. It seems that the low-level language C is the only option for writing device drivers. This makes it unlikely that operating systems will work on stack machines.

The problem is not a technical one. Forth is a great language for getting direct access to hardware, and there are microcontrollers available that run Forth on bare metal. The more serious problem is that a desktop operating system has to support thousands of different hardware devices. Writing the code for all these devices will take an endless number of person-years.[1] This effort is very costly. Rewriting existing C device drivers in Forth is too expensive, which prevents such a project from getting started. It seems that the x86 architecture is the only viable computer system that is able to run desktop operating systems.

Perhaps it makes sense to take a step back and understand why exactly C was chosen in mainstream computing. The goal was to write device drivers that consist of millions of lines of code. Instead of writing this code in assembly language, which is different for each processor, the idea was to write the device drivers in C. C is more portable than assembly and is easier to learn. What is needed in addition is a compiler that generates the assembly instructions automatically. This paradigm has been valid in computing for decades.
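
To make the portability argument concrete, here is a small sketch of driver-style C code. The device is imaginary: the struct uart_regs layout, the UART_TX_READY bit and the function uart_putc are assumptions made up for this example and don't describe any real chip or kernel API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register layout of an imaginary serial device. */
struct uart_regs {
    volatile uint32_t status; /* bit 0 set: transmitter is ready */
    volatile uint32_t data;   /* writing a byte here sends it */
};

#define UART_TX_READY 0x1u

/* Sends one byte. Returns 0 on success, -1 if the device is busy. */
int uart_putc(struct uart_regs *regs, uint8_t c)
{
    assert(regs != NULL);
    if ((regs->status & UART_TX_READY) == 0)
        return -1;  /* transmitter not ready, caller may retry */
    regs->data = c; /* volatile store, not optimized away */
    return 0;
}
```

The source contains no processor-specific instruction. On real hardware the regs pointer would refer to the device's memory-mapped address, and the compiler for each target would generate the matching load and store instructions. For a quick check on a desktop machine, the function can be called with a pointer to an ordinary struct uart_regs variable that stands in for the device.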

The only bottleneck for a C compiler is that it needs a certain target architecture, namely a compiler-friendly one such as x86. Possible alternatives such as a RISC CPU, and especially a Forth CPU, stand in the way of converting C code into assembly instructions.

The Linux kernel contains at least 5 million lines of code reserved for device drivers.[1] A potential Linux alternative written in Forth has to provide the same functionality. From a technical point of view it is possible to rewrite the device drivers in Forth, but from an economic perspective it doesn't make sense. A common rule of thumb is that a single programmer writes only about 10 lines of code per day, no matter which programming language is used. And the open question is: who exactly should write all the code in Forth?
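
The numbers above can be turned into a rough effort estimate. The sketch below is only back-of-the-envelope arithmetic; the function name person_years is made up for this example, and the figure of 250 working days per year is an assumption that does not come from the cited paper.

```c
#include <assert.h>

/* Estimated effort in whole person-years for writing total_lines of code
 * at lines_per_day, assuming workdays_per_year working days per year. */
long person_years(long total_lines, long lines_per_day, long workdays_per_year)
{
    assert(lines_per_day > 0 && workdays_per_year > 0);
    return total_lines / (lines_per_day * workdays_per_year);
}
```

With 5,000,000 lines, 10 lines per day and the assumed 250 working days per year, person_years(5000000, 10, 250) yields 2000. Under these assumptions, even a team of 100 full-time Forth programmers would need about 20 years.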

There is a reason why all the desktop operating systems were written in C: the existing code was mostly written in C, and it is much easier to add something to an existing codebase than to rewrite it from scratch. The unstated assumption is that all 5 million lines of code are needed, because otherwise the computer isn't able to detect or manage certain hardware, for example a graphics card or a network card. The second assumption is that even the Forth language will need device drivers. It is not possible to write a Forth OS in 10k lines of code that provides the same functionality as the existing device drivers, which are written in 5 million lines of code. This might explain why Forth is not very popular in mainstream computing. Even if the concept is interesting from a theoretical point of view, it can't answer the question of how to write all the source code that is needed in an operating system. Existing Forth tutorials explain to the newbie what a stack machine is and how to combine Forth words into programs. But this ability is not enough to build full-blown desktop operating systems in the style of Linux, macOS or Windows.

In contrast, the C ecosystem shows very well how to handle complexity. According to the C paradigm, the programmer writes C code for new hardware and commits this code to the existing codebase, which improves the functionality of the Linux kernel. That means 5 million LoC are already there, a new device driver adds around 200 lines of code, and how exactly to program that code is only a question of detail.

So-called Forth systems and stack-oriented programming languages like Factorcode ignore the problem of device drivers, especially the question of how to create the millions of lines of code needed to access the endless number of existing hardware devices.

[1] Kadav, Asim, and Michael M. Swift. "Understanding modern device drivers." ACM SIGPLAN Notices 47.4 (2012): 87-98.