March 31, 2023

Document converter from Lyx to MS-Word

There is a visible conflict between the LaTeX community and the MS-Word ecosystem. The LaTeX community claims that its system is superior to all other word processors, even though a 20 GB texlive distribution has to be installed just to create a simple pdf file. In contrast, the MS-Word community believes that MS Word is easier to use and that it encourages collaborative editing.

It is obvious that this conflict can't be solved in the near future, but perhaps a third option is available. From a technical point of view it is possible to export from one format into another. A document created in Lyx can be stored in the .docx format and sent to anybody, without pressuring the recipient to use Lyx and LaTeX as well.

The same workflow is possible by creating the document in markdown. Converting markdown into .docx can also be automated. There is no need for all users to prefer the same software tool on the same operating system; how a text is created is simply a personal choice.
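
Pandoc is a common tool for this step. A minimal sketch in Python, assuming pandoc is installed and a markdown file notes.md exists (both file names are placeholders):

    # Convert a markdown file into .docx by calling pandoc.
    # Requires a pandoc installation available on the PATH.
    import subprocess

    subprocess.run(["pandoc", "notes.md", "-o", "notes.docx"], check=True)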

Similar to the HTML file format, the .docx format is some sort of universal standard which is understood by everybody. On the other hand, there is no need to prefer this format for creating new documents; there are many reasons which speak against MS Word as typesetting software.

According to the Lyx dialog window, there is a long list of export formats like DVI, HTML, ODF, plain text and the mentioned .docx format. The ability to export a document is important to ensure that different user groups can stay within their self-created space: user1 uses Lyx, user2 prefers Word, user3 uses Libreoffice and so on. There is no reason why the world needs a single word processing program which is accepted by all. Even LaTeX is not the best tool; there are plenty of reasons which speak against the software. The only thing which can be accepted as a minimal standard is the ability to convert back and forth between different systems. It makes sense if software A can store a file in the format of software B.

This understanding has much in common with natural language. Of course, the English language is available as a common standard. But at the same time, many reasons speak against English. Instead of encouraging people to learn English, the better idea is to provide a translator which converts from language A into language B.

The prediction is that in the future we will see more document formats, not fewer. What the LaTeX community won't do is switch to the MS Word ecosystem. What the world really needs are better export filters which allow all the programs to be used at the same time.

The best example of why it makes no sense to decide between MS-Word and LaTeX is the existence of the Scrivener software. Scrivener is an outliner which is used in addition to a word processor. Similar to MS-Word it works with the WYSIWYG paradigm, and it demonstrates that even within the MS Word community there is no unity about which software to use for writing a document. The same situation exists within the LaTeX community. There is not a single user group which works with the same software, but dozens of frontends and macro packages which are incompatible with each other. The only option to establish a single ecosystem which is accepted by all is to program converters which make it easy to transfer data between incompatible systems.


March 29, 2023

LaTeX Questionnaire

Version history:
- April 10, 2023, added more questions
- April 16, 2023, increased questions to 70+
- April 23, 2023, increased questions to 80+
- May 5, 2023, increased questions to 90+
- May 25, 2023, increased questions to 100+

Please answer the following 100+ questions from the perspective of a LaTeX expert!

1 General
- Is there a built-in spelling correction?
- Can the hyphenation be deactivated?
- How much disk space is needed for a LaTeX distribution on the hard drive?
- Does the BibTeX format support Unicode characters?
- Is LaTeX standardized?
- What is the preferred output media (e.g. physical paper, screen, mp3 audio)?

1a Programming
- In which programming language was the TeX software created?
- How much programming skill is required to create LaTeX documents?
- Are programming skills required to create LaTeX packages from scratch?
- Are you using self-created, undocumented TeX macros within the document?
- How complicated is it to learn the MetaPost language?

2 File export
- Can MS-Word open a LaTeX document?
- Is it possible to export into the PDF/A document format?
- Is it difficult to copy&paste text from a LaTeX generated PDF document into other programs?
- Is LaTeX a good choice for creating non-pdf documents like blog entries and API documentation?
- Can Lualatex render a .docx file into the pdf format?
- Which kind of layout bugs are visible after converting LaTeX documents into the docx format with pandoc?

2a Team work
- Is LaTeX recommended for interdisciplinary teams with a focus on high productivity?
- What is better: a poorly written dissertation formatted in LaTeX or a well written one typeset in MS-Word?
- Is it easy for newbies to become familiar with LaTeX?
- Was the LaTeX system created for the masses, especially for computer newbies?
- Does LaTeX have a built-in version control system to record and track changes?

2b epub
- How to export into the epub format?
- How important is a fully justified layout for ebooks (e.g. low, medium, high)?
- How many epub books are fully justified?
- Is the Plos One megajournal formatted in fully justified mode?

2c Lyx
- Is the "track changes" option in Lyx the same as the "track changes" menu in the Overleaf editor?
- Can a .lyx file be edited in a normal text editor?
- Is it possible to open a Lyx file from someone else on the local computer?
- Is Lyx better than MS-Word?

2d DVI
- Is the xdvi viewer able to render .docx files?
- Is there a DVI previewer for ecomstation (OS/2 Warp) available?

2e MS Word
- Does the average audience prefer MS Word or LaTeX generated documents?
- Do you know a Word template which imitates LaTeX?
- How does a LaTeX document look different from a template-based MS-Word document?
- What is the typographic quality of justified paragraphs in MS Word (e.g. poor, medium, perfect)?
- Suppose a dissertation thesis was written in MS Word, is the layout left-aligned or fully justified?

3 GUI
- Is there a GUI for entering mathematical equations?
- Can a table be created graphically?

4 Fonts
- Is it possible to use fonts from the operating system?
- Which steps are required to activate the Linux Libertine Font?
- Is a font size of 10.5pt possible?
- What is wrong with Computer Modern (e.g. not available as Type1, was dismissed over Times New Roman)?

4a Images
- Is it possible to position images manually?
- Can an image be positioned over existing text?
- Can pdflatex handle SVG images?
- What is the purpose of floating an image in the text?

4b Columns
- Is there a three column mode?
- Are the columns register-true (aligned to a common baseline grid)?

4c Layout
- Does a LaTeX formatted document look cleaner than the MS-Word counterpart?
- Is it possible to create documents which look different from a typical LaTeX paper?

4d Glue
- Does optimized vertical space between paragraphs improve the layout?
- Are micro typographic extensions a must have?
- Is it possible to implement the Knuth Plass algorithm outside of TeX in a different software?
- Is there an implementation available for the Knuth Plass algorithm in the stack based Forth83 language?

5 Publishing houses
- Is LaTeX used by professional publishers?
- Is LaTeX superior to Adobe InDesign?
- Is the workflow the following: an author sends a Word document to a publisher, and then the publisher converts it into LaTeX to streamline it with the internal workflow?
- Is it normal that a journal asks the author for substantial rewriting?

5a Preprint server
- Is it easy to submit a LaTeX formatted document to a preprint server?
- What is your opinion about preprint servers which accept only MS Word docx files?
- Can you name three papers on Arxiv that were not created with LaTeX?
- How many academic papers are created with LaTeX (e.g. 30%, 60%, 90%)?
- Is the citation count for LaTeX formatted documents higher than for MS Word equivalents?

5b Document format
- How to make a peer review for a LaTeX document?
- Is the .tex file format a zip container which can store both text and images?

6 Justified paragraphs
- How to activate left aligned paragraphs in LaTeX?
- Why is LaTeX recommended, if the rule is to format the manuscript left aligned?
- What is harder to realize from a technical perspective: left aligned paragraphs or fully justified text?
- Does a justified column need less space than the left aligned counterpart?
- Can a fully justified column be realized without adjusting the inter word spaces and without hyphenation?
- Does the PDF format have the ability to store varying word spaces, which is needed for a justified line of text?
- Do justified paragraphs look great on a smartphone?

6a Sorting game
Please sort the following items by their typographic quality from low to high:
- MS Word fully justified, LaTeX left aligned, LaTeX fully justified, MS Word left aligned.
- Adobe InDesign left aligned text, QuarkXPress left aligned text, MS Word left aligned text, LaTeX left aligned text

6b Reasons for justified text
- Is the idea behind the LaTeX software to imitate the Berthold Diatronic typesetting machine, including its ability to produce justified text?
- In the time until the year 1990, which percentage of printed newspapers were using justified paragraphs (e.g. 30%, 60%, 90%)?

7 Accessible typography
- Is there a study available which proves that fully justified text is easier to read than left aligned text?
- Are documents with a complex layout (multiple columns, justified text, small font size, difficult language, serif fonts) a sign of excellence?
- Was the DVI format created to display text on different screen sizes which includes smartphones and Desktop PCs with flexible line breaking?
- Is a justified paragraph including hyphenation compatible with screenreaders like Jaws?
- Is the PNG graphics format the optimal output format for a LaTeX document (e.g. dvipng -D 300 main.dvi)?
- Does it make sense to use hyphenation on smartphone displays?

7a Checklist for accessible PDF documents
- Are the paragraphs left aligned?
- Was hyphenation deactivated?
- Is it a single column layout?
- Is there a 12pt sans serif font combined with 1.5 linespacing?
- Is the language simple English without complex vocabulary?

7b pdftohtml
- What is the result of using "pdftohtml in.pdf out.html" for a LaTeX generated pdf file (e.g. looks nasty, looks great)?
- Is the pdftohtml tool able to detect multiple columns, remove the hyphenation and fix the interword spacing of LaTeX so that the generated html file looks great?
- How to convert a random pdf file from Arxiv into plain text?

7c LaTeX and Accessibility
- Can pdflatex generate a PDF/UA file?
- What is the quality of the CTAN accessibility package (e.g. low, medium, high)?
- If LaTeX is only a printer driver and the generated documents do not include structure and tagging, what does it mean for accessibility?
- Which OCR software can convert a pdf document into a LaTeX file?

8 Print typography
- Which program will produce the better print typography (e.g. MS-Word, LaTeX)?
- Which percentage of Arxiv papers are getting printed on physical paper?
- Are printed academic journals the preferred medium for knowledge distribution?
- What are the costs to print out a single copy of a book which has 300 pages?

8a Web typography
- Is LaTeX trying to produce web typography which looks great on computer screens?
- Is LaTeX ready for digital publication in the PDF format?

8b Wall of text
- Which paragraph formatting will produce a wall of text (left aligned or fully justified)?
- Can a LaTeX document have images to reduce the amount of text?
- What is the reader experience for a wall of text (e.g. easy to read, hard to read, unreadable)?


March 25, 2023

The difference between a pocket calculator and GPT-3 enabled Artificial Intelligence

In contrast to AI based software, a pocket calculator is understood by the public quite well. The technology has been available at least since the 1970s, and lots of books explain its inner working. The core element is a CPU which gets programmed in assembly, and the machine allows the user to enter a task like “2+4”. From a technical perspective a pocket calculator consists of the CPU, which is in the easiest case an 8bit model, some sort of onboard RAM and, very importantly, a program which takes the input of the user and sends instructions to the CPU.
Creating yet another pocket calculator in hardware is easy. It is also possible to write a Python program which emulates a pocket calculator. There are endless software and hardware components available for this purpose and they can be explained easily to newbies. It is more complicated than ordinary mathematics, but the technology is not very advanced.
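
Such an emulator fits in a few lines of Python. The following is a deliberately simplified sketch which only handles a single binary operation like “2+4”:

    # Minimal pocket calculator emulator: takes a task like "2+4",
    # splits it at the operator and computes the result.
    # Simplified sketch: one operator, no negative numbers, no precedence.
    import operator

    OPS = {"+": operator.add, "-": operator.sub,
           "*": operator.mul, "/": operator.truediv}

    def calculate(task: str) -> float:
        for symbol, func in OPS.items():
            left, sep, right = task.partition(symbol)
            if sep:  # operator found in the input
                return func(float(left), float(right))
        raise ValueError("unknown task: " + task)

    print(calculate("2+4"))  # prints 6.0
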
In contrast, a modern gpt-3 driven Artificial Intelligence works very differently from a pocket calculator. Even though the system consists of hardware and software components, it can't be grasped with the traditional terms used in computer science. It is also more complicated to explain to newbies what a neural network is about. The inner working of AI can be summarized the following way.
The AI software is able to convert back and forth between natural language and numerical arrays. The input of the user gets converted into numbers, then the system does something with the numbers, and the output is converted back into natural language. This number-word engine is the core element of Artificial Intelligence; it allows the system to attack any problem. For decades it was unknown how to do so, and it was even unclear whether such a transition was needed. Creating word embeddings is sometimes called the symbol grounding problem. For example, the word cat is not only a sequence of single characters (C + A + T); in a conceptual space, cat is represented as a number next to other words like dog, mouse and so on.
The surprising situation is that after solving the word embedding problem, it is quite easy to construct a human level Artificial Intelligence. If a problem is reformulated as a numerical mathematical problem, existing computer technology can be applied to it, which means the information is stored in the main memory and there are routines which process it. The only bottleneck is the transformation back and forth from words to numbers. The core element of any advanced Artificial Intelligence, which would allow rebuilding the gpt-3 software from scratch, is a word embedding algorithm. Such a software component takes an English sentence and converts it into a mathematical vector. The details of this sophisticated technology are not understood very well and it is a very new approach.
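
To illustrate the idea of a conceptual space (not the actual gpt-3 implementation), a toy word embedding can be written as a plain dictionary that maps words to short vectors; the numbers below are invented for illustration only:

    # Toy word embedding: each word is a small hand-made vector.
    # Real systems learn vectors with hundreds of dimensions from text.
    import math

    embedding = {
        "cat":   [0.9, 0.1, 0.0],
        "dog":   [0.8, 0.2, 0.0],
        "mouse": [0.7, 0.3, 0.1],
        "car":   [0.0, 0.1, 0.9],
    }

    def cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        norm_a = math.sqrt(sum(x * x for x in a))
        norm_b = math.sqrt(sum(x * x for x in b))
        return dot / (norm_a * norm_b)

    # "cat" sits next to "dog" in the conceptual space, far from "car".
    print(cosine(embedding["cat"], embedding["dog"]))  # ~0.99
    print(cosine(embedding["cat"], embedding["car"]))  # ~0.01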

Determine prime numbers with a questionnaire

The classical approach to calculating prime numbers works with trial division, implemented in the programming language of choice on a modern PC. The speed depends mainly on the compiler efficiency plus some handcrafted performance improvements in the source code. The Rosetta Code website provides a good introduction to the subject [1].
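
For reference, a plain trial division test in Python looks like this:

    # Trial division: test all divisors up to the square root of n.
    def is_prime(n: int) -> bool:
        if n < 2:
            return False
        d = 2
        while d * d <= n:
            if n % d == 0:
                return False
            d += 1
        return True

    print([n for n in range(2, 30) if is_prime(n)])
    # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
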
What is missing in this classical approach are heuristics which have nothing to do with the programming itself but reduce the problem space in a general way. A possible option for doing so is a questionnaire formulated in natural language. The interesting point is that such an approach has nothing to do with solving the original problem by computer programming; it is about reformulating the original problem. Here is an example:
1. Is the tested number greater than 2?
2. Does the number end with 0, 2, 4, 6 or 8?
3. Is the number a multiple of one of the prime numbers from 2 to 100?
4. What is the digit sum (add each single digit)?
5. Is it possible to store the prime sieve from 2 to 1000 on the computer system?
What the algorithm to determine the prime numbers has to do is answer these questions. The answers are stored in an array in the format [yes, no, yes, 26, no].
In response to an answer set, the algorithm will choose a certain strategy to determine the prime number. The tool of a questionnaire allows referencing domain specific knowledge in the context of prime number generation. There are lots of other questions available, but for reasons of simplicity it makes sense to start with only five of them.
The advantage of using a questionnaire to store domain knowledge is that it allows formulating the knowledge in a machine readable and a human friendly format at the same time. The wisdom is divided into chunks which follow the question-answer paradigm. The resulting array of answers can be processed by a computer easily.
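
A minimal sketch of this idea in Python; the five questions are encoded directly, and the strategy selection triggered by the answer array is only hinted at:

    # Answer the five questions for a candidate number n and store the
    # result in an answer array such as [True, False, False, 25, True].
    SMALL_PRIMES = [p for p in range(2, 101)
                    if all(p % d != 0 for d in range(2, p))]

    def questionnaire(n: int) -> list:
        return [
            n > 2,                          # 1. greater than 2?
            n % 10 in (0, 2, 4, 6, 8),      # 2. ends with an even digit?
            any(n != p and n % p == 0
                for p in SMALL_PRIMES),     # 3. multiple of a small prime?
            sum(int(c) for c in str(n)),    # 4. digit sum
            n <= 1000,                      # 5. fits into a sieve up to 1000?
        ]

    # A strategy is then chosen from the answers, e.g. a True for
    # question 2 or 3 rules the number out immediately.
    print(questionnaire(997))  # [True, False, False, 25, True]
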
References
[1] https://rosettacode.org/wiki/Sequence_of_primes_by_trial_division

Tribute to the FreeBSD operating system

In contrast to the Linux operating system, FreeBSD is a less known software which should be introduced. The most important period was the time until the year 1990, in which BSD Unix was the de facto standard for Unix systems. During the 1980s, Linux hadn't been invented yet, and very advanced projects like Sun emerged. Unfortunately, the period since 1990 has been less successful for the BSD operating system. The first problem was that BSD was split into different sub-projects like FreeBSD, NetBSD, OpenBSD and DragonFly BSD. The second problem was that in 1995 the University of California, Berkeley stopped the development of the code, so the BSD project became a non-academic project.
FreeBSD itself is running well, so it is too early to describe it as a dead end, but many projects closely related to FreeBSD have become obsolete. The mentioned company Sun, which was famous for workstations and the Java technology, stopped its operation in 2010. The OpenSolaris operating system, which works with ZFS and dtrace, was canceled in 2009. Other discontinued side projects are PC-BSD (canceled in 2018) and perhaps NetBSD, which has had a very low number of commits in the last year.
Let us take a closer look at FreeBSD itself. It is the largest of the remaining BSD projects and has around 11,000 commits per year. In comparison, the Linux kernel has around 74k commits per year. The user experience for FreeBSD on the desktop is disappointing: most WLAN cards are not supported, and the drivers for sound cards are poorly maintained. The share of users who prefer FreeBSD over Linux on the desktop is likely very low. A second problem with the FreeBSD ecosystem is that, for historical reasons, the code isn't available under a GPL license, which stands in contrast to other large open source projects like Gimp or the GNU C compiler.

March 08, 2023

How Artificial Intelligence was discussed in the 1990s

With the advent of deep learning and humanoid robotics, the AI community has demonstrated that everything is possible. Software is able to parse natural language, play video games and control robots. The surprising success of AI technology wasn't expected by most researchers in the past. There is a difference between how AI related problems were discussed in the past and how they are discussed from today's perspective.
Before it makes sense to explain more recent algorithms, there is a need to take a look back at how AI problems were analyzed 30 years ago, in the 1990s. The obvious difference is that in the past the amount of optimism was lower. In the late 1980s there was an AI winter, which means that even computer experts were disappointed by the capabilities of neural networks and expert systems. The reason is that during this time a certain sort of question was asked.
A typical problem discussed in the early 1990s was the n-queens problem, which can be solved with a backtracking algorithm. At that time, such a problem was treated as a state of the art Artificial Intelligence problem. Many obstacles had to be overcome before such a problem could be solved. The first challenge was to get access to a reasonably fast computer. A normal Commodore 64 home computer was too slow for computer science problems and it was difficult to program the source code on it. The more effective way of implementing computer science problems was an MS DOS PC in combination with the Turbo Pascal programming language. So most of the effort in the 1990s was directed towards getting such a PC setup running. Simple tasks like installing the Pascal compiler or writing a hello world program in the IDE were some sort of advanced programming task.
Suppose the AI programmer in the early 1990s had mastered all the requirements and was able to write simple programs in Pascal. The next challenge was to explore the direction called Artificial Intelligence. The n-queens problem is some sort of benchmark to test out search algorithms. The task is to find the positions of 8 queens on the chess board by simulating all the possible game states, and the only valid method to solve the problem was complete enumeration. Complete enumeration means calculating thousands upon thousands of different possibilities and testing whether a solution fits the constraints.
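
For comparison with today's tooling: in modern Python, a classic backtracking solver for the 8 queens fits on one screen. This is a sketch of the textbook algorithm, not the original Pascal code; it returns the first valid placement as one column index per row:

    # Backtracking n-queens: place one queen per row and reject any
    # column or diagonal conflict immediately, instead of enumerating
    # the complete state space.
    def solve(n, queens=()):
        row = len(queens)
        if row == n:
            return queens  # all rows filled: valid placement found
        for col in range(n):
            if all(col != c and abs(col - c) != row - r
                   for r, c in enumerate(queens)):
                result = solve(n, queens + (col,))
                if result is not None:
                    return result
        return None

    print(solve(8))  # first solution: (0, 4, 7, 5, 2, 6, 1, 3)
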
A typical runtime behavior of such an algorithm is that it occupies all the CPU resources for hours. The Turbo Pascal program first gets compiled into a binary .exe file, and then it has to run for 3 hours without interruption. During that time the progress is shown on the screen as a percentage number which counts upwards very slowly. At least in the 1990s, this kind of problem solving strategy was state of the art Artificial Intelligence.
It should be mentioned that in most cases the n-queens algorithm didn't figure out the answer to the problem. Either something was wrong with the Pascal program, or the state space was too large. The logical consequence was that the problem remained unsolved. That means the technology of the 1990s was not able to solve the problem. If simple n-queens problems remained unsolved, more advanced tasks like controlling a robot were also out of reach for 1990s software developers.
Complete enumeration and backtracking algorithms are typical examples of non-heuristic problem solving strategies. In the 1990s these algorithms were used as the only strategy to address AI related problems. They are simple to explain and simple to implement, they need a huge amount of CPU time, and it can be shown that they can't solve any serious problem. This outcome was used as proof that a certain problem is NP-complete, which means that computers in general are not able to solve it. Even if the computer is 10x faster, it is not fast enough to traverse the state space in a reasonable amount of time.
Most of the AI programmers in the 1990s were educated with such a bias. The reason was that more advanced problem solving algorithms were unknown during this time.

AI in the 1990s vs 2020s

There are reasons why AI has evolved over the years. The following table shows a direct comparison of the tools.

                     | early 1990s  | 2020s
Programming language | Turbo Pascal | Python
Operating system     | MS DOS       | Linux
Hardware             | Intel 286 PC | multi core CPU
Algorithm            | backtracking | heuristics
Problem              | n-queen      | sensor grounding
Computer books       | printed      | Internet
There is no single reason why modern AI research works so much better; it is a combination of factors. Most software in the 1990s was programmed on slow PCs with complicated programming languages. In contrast, most programming today happens on a stable Linux-like operating system in combination with the easy to handle Python language. This reduces the development effort. Another major difference is the access to computer science books. In the early 1990s it was nearly impossible to read a state of the art book or paper about Artificial Intelligence. In the 2020s it is pretty easy to find such information online.
The logical consequence is that over the years the researchers have created more advanced AI related projects. Former challenges were solved and lots of discoveries were made. The chance is high that this development will continue. That means that 20 years from now, lots of improvements will be visible. Perhaps future programmers will laugh about current technology like the Internet or the Python programming language.
A seldom described difference between AI research in the 1980s and today are the problems which are addressed. Suppose the idea is to solve the n-queens problem. Then a certain sort of algorithm and programming language is selected. Before the problem can be solved, somebody has to explain why it is important. From a technical perspective it would have been possible to address more recent problems like “sensor grounding” with 1980s hardware and software. That means a simple MS DOS PC in combination with the Pascal programming language could have been utilized to realize an advanced robot. But in the 1980s nobody was aware of such a thing as the symbol grounding problem. Even the idea of realizing a line following robot hadn't been invented during this time.
On the other hand it is funny to see that even with modern hardware and software the well known n-queens problem remains unsolved in its brute force form. Even on a quad core CPU and a 64bit operating system it is not possible to traverse the entire state space to find the correct positions for eight queens.

March 06, 2023

Zettelkasten Questionnaire

Please select a random Zettel from your analog slip box and answer the following 80+ questions.

1 Shape of card
- Is the paper lined (e.g. lined, blank, squared)?
- What color is the paper (e.g. white, blue, light red)?
- How thick is the paper (e.g. normal index card 180gsm, thin paper 80gsm)?
- In which condition is the card (e.g. excellent, worn down, damaged by fire)?
- What abnormalities are there (e.g. paperclip on top, glue on back, extra postit notes)?
- Does the card have holes on the left for a personal organizer (e.g. ISO 838, 6 ring binder)?

1a Longhand writing
- Was the text written in longhand (e.g. longhand, typewriter, PC printer)?
- If longhand, what sort of pen was used (e.g. pencil, fountain pen, roller ball)?
- If typewriter, which sort of typewriter was used (e.g. mechanical, electrical)?
- If PC, which sort of PC printer was used (e.g. inkjet, laser, dot matrix)?
- What is the color of the ink (e.g. blue, black)?
- Is it readable for others (e.g. very good, medium, poor)?
- Are the characters connected which makes it harder for OCR recognition (e.g. yes, no)?
- Is the ink clearly visible (e.g. high contrast, faded writing)?

1b Layout
- What is the direction of the writing (e.g. portrait, landscape)?
- Are the lines written in two column mode (e.g. one column, two column)?
- Was a highlighter used for some words (e.g. no, yes in yellow)?

1c Size of card
- What is the size (e.g. DIN A6, A7, 3x5”)?
- How was the paper cut (e.g. pre-cut paper, by scissor, knife at ruler, paper cutting machine)?
- If a cutting machine was used, which type exactly (e.g. safety rotary, guillotine)?
- How many sheets of paper were cut at the same time (e.g. 0, 1, 2, 3)?
- Is the card folded, like DIN A5 folded to A6 (e.g. no, yes)?

2 Content
- How many words are written down (e.g. 50)?
- Does the Zettel have a picture (e.g. none, diagram, pencil drawing, mindmap, abstract drawing, photorealistic artwork)?
- Is there a table in the body (e.g. no, yes)?
- Is there writing on the backside too (e.g. no, a bit, very much)?
- What is written on the backside (e.g. bibliographic reference, normal note, direct quote)?
- What is the language of the card (e.g. German, English, Lingua Latīna)?
- Is it a hub note or a normal note (e.g. normal_note, hub_note, bib_note, register_card)?

2a Reason for creation
- At which place was the card created (e.g. in the library, at home, at work)?
- Was the Zettel used for writing a new book (e.g. new book, new novel, none)?
- What is the purpose (e.g. recipe collection, excerpt notes, lecture notes, plotting a story)?
- How many people have seen the card (e.g. only me, one other, many)?
- Was the card used during a presentation in front of an audience (e.g. no, yes)?
- Has the content a practical nature (e.g. practical, theoretical)?

2b Format of Bibliographic reference
- Are there bibliographic references (e.g. 0, 1, 2)?
- What is the citation style of the reference (e.g. [AuthorYear], [1], (a))?
- Is the page number written near the reference (e.g. no, yes)?
- Where is the Author-Title-year information located (e.g. on the backside, at bib card, in external database)?
- Is the reference stored in a digital database too (e.g. zotero, bibtex, none)?

2b1 Content of bibliographic references
- What sort of reference is it (e.g. a paper, a book, a podcast)?
- Who is the author of the reference (e.g. Mark Twain)?
- Is the reference publicly available (e.g. in the internet, only in printed format)?
- Is there a direct quote from a reference on the card (e.g. no, yes)?
- Does the Zettel judge the bibliographic reference with a negative connotation (e.g. negative, neutral, positive)?
- What is the language of the bibliographic references (e.g. German, English, Lingua Latīna)?

2c Time
- When was the Zettel created (e.g. January 2023)?
- Does the Zettel contain a written date (e.g. no, yes)?
- If there is a date, how was it created (e.g. manual pen, rubber date stamp)?
- If there is a bibliographic reference, from which year is it (e.g. 2014)?
- Was information added later (e.g. no, with 1 day delay, with 1 week delay)?
- How long does it take to write the Zettel in longhand (e.g. 5 minutes, 10 minutes)?
- How many minutes would it take to type in the Zettel into a computer keyboard (e.g. 4, 6)?
- Dropping the Zettel from 1 meter, how long does it take until it hits the ground (e.g. 0.5 seconds, 1 second)?

2d Links
- Are there outgoing links to other cards (e.g. 0, 1, 2)?
- Are there incoming links from other cards (e.g. 0, 1, 2)?

2e Named entity recognition
- Does the note contain person names (e.g. no, yes)?
- If yes, what is the gender (e.g. male, female)?
- Are locations like a city mentioned in the note (e.g. no, yes)?
- If yes, which continent is it (e.g. Europe, America, Asia, Africa)?
- Are dates available like a year or a certain time (e.g. no, yes)?
- If yes, which century is it (e.g. 21st, 20th, 19th)?
- Are product names mentioned (e.g. no, yes)?
- If yes, what is the price range (e.g. below 100 US$, more than 100 US$)?
- Does the Zettel contain measurement units like kilogram or meter (e.g. no, yes)?
- If yes, what is the purpose (e.g. measure biology, measure physics, social indicators)?

2e1 Part of speech tagging
- Does a sentence contain a question mark to label a misunderstood subject (e.g. no, yes)?
- How many nouns are on the card (e.g. 10)?
- Is there a personal comment on the Zettel (e.g. no, yes)?

3 Indexing
- Was the card indexed with a tag for faster recall (e.g. no, yes)?
- Are some words underlined (e.g. 0, 1, 2)?
- Has the card a colored tab (e.g. no, yes)?
- What is the general subject (e.g. Math, Art, Business, Sociology)?
- Give two keywords which describe the Zettel (e.g. Euclidean geometry)
- How important is the Zettel to understand the general subject (e.g. low, medium, high)?
- What is the word count of the title (e.g. 2, 4, 6)?

3a Machine readable
- Was the card scanned for backup reasons (e.g. no, as jpeg file, including OCR)?
- If the card was scanned, what is the dpi resolution (e.g. 150dpi, 300dpi)?
- Is there a barcode or similar identification system (e.g. none, barcode, qr-code, rfid)?
- Has the edge some holes (e.g. edge-notched card, none)?
- Is the information on the card also stored in a digital PKM software (e.g. no, it is unique; yes, it is a copy)?
- Was carbon paper used for write through on a second card for backup reasons (e.g. no, yes)?

3b Luhmann ID
- What is the position of the Luhmann ID on the card (e.g. top left, top right)?
- What is the format (e.g. alphanumeric, IP number like, with slash after first digit)?
- If the ID is nested, how deep is it (e.g. 3 levels, 6 levels)?
- What is the classification system (e.g. self created, Dewey decimal classification)?
- Does the card have a fixed position in a stack (e.g. no, yes as a juxtaposition)?
- Does the ID contain a minus sign to sort the card before the parent card (e.g. no, yes)?