March 22, 2020

Can Wikipedia be forked?

Wikipedia as a whole is too large to fork. The project has over 20k users, and building a second encyclopedia from scratch would take too much manpower. But if the aim is to fork only a single category, for example the articles about Artificial Intelligence, a fork isn't very hard.

Suppose a single user makes 30 edits per month with a size of 1000 bytes each, and the fork consists of 10 users who work in parallel. That adds up to 300,000 bytes per month. After 5 years the project has generated 18 million bytes, which is 9000 articles of 2000 bytes each. And after 10 years the small team of 10 users has produced about the same amount of content that is available in the corresponding category of the real Wikipedia.
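The estimate can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: the function name `fork_output` and its parameter names are made up for illustration, and the model assumes that every user keeps up a constant edit rate for the whole period, which is a simplification.

```python
def fork_output(users, edits_per_month, bytes_per_edit, years, bytes_per_article):
    """Estimate how much text a small fork team produces.

    Assumes a constant edit rate for every user (a simplification).
    Returns the total number of bytes and the number of articles.
    """
    months = years * 12
    total_bytes = users * edits_per_month * bytes_per_edit * months
    articles = total_bytes // bytes_per_article
    return total_bytes, articles


# The numbers from the paragraph above: 10 users, 30 edits per month,
# 1000 bytes per edit, 2000 bytes per article.
for years in (5, 10):
    total_bytes, articles = fork_output(10, 30, 1000, years, 2000)
    print(years, "years:", total_bytes, "bytes,", articles, "articles")
# 5 years: 18000000 bytes, 9000 articles
# 10 years: 36000000 bytes, 18000 articles
```

Changing the number of users or the edit rate shows how sensitive the result is to the motivation of the team, which is the real bottleneck.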

A good starting point for a Wikipedia fork is to stop submitting new articles to Wikipedia itself and to submit them only to the fork. The list of requested articles about AI is located at https://en.wikipedia.org/wiki/Wikipedia:Requested_articles/Applied_arts_and_sciences/Computer_science,_computing,_and_Internet#Artificial_Intelligence These articles haven't been written yet, so they can be created from scratch and then uploaded to the fork wiki. The bottleneck for the project is motivating users to participate. In most cases users are only interested in uploading content to Wikipedia and not to a fork, because the clone has fewer page views and no working copy-editing team that corrects spelling mistakes and moderates the process.

On the other hand, the content of the original Wikipedia is overrated. The AI section consists of around 50 flagship articles with 50k bytes, and the rest is of poor quality. It's possible to build something better from scratch. That means not taking the existing content as a starting point but writing everything anew, which keeps the risk of copyright conflicts as low as possible.

The only thing that is harder to fork is Google Scholar. Google Scholar and the underlying full-text repository contain 50 million academic papers. The AI section in Google Scholar has around 1 million papers written by scholarly authors. Writing this content from scratch is very complicated and would take a large amount of time and manpower. In contrast, the Wikipedia project is a kind of slideshow community. Its users create overview snippets for existing academic full-text papers in the hope that this is attractive to a larger audience.

The reason why academic publishers are not motivated to engage in Wikipedia is simple: the project is trivial. Trivial means that the amount of resources required to build an encyclopedia is low. The entire Wikipedia, with all its articles, can be run with around 10k people. If the aim is to build only a subpart of the project about a single academic topic, for example Artificial Intelligence, the resources needed are around 10 persons who create the content from scratch. That means academic authors are able to build their own encyclopedia without copying and pasting a single sentence. They can write all the articles from scratch with fewer than 100 users in a short amount of time.