January 15, 2020

POV forking of Wikipedia

At first glance, the git tool and the Wikipedia project work the same way, because both keep a version history. The difference is that the Wikipedia project has never been forked in its history; only local copies have been created. A fork is a technique used heavily on GitHub to bypass the original community and start developing a new branch. The main feature of a fork is its ability to integrate updates from the upstream. That means the fork contains the latest information plus extra content.

The subject as a whole is very complicated, so I have decided to run a simple experiment to test what happens in reality. As a first step, the fork is created only on the local hard drive, not on the Internet, and it does not contain the entire Wikipedia but only a few files. It is well documented, however, so that other users can reproduce the steps. It starts with creating a new git project in a working directory:

mkdir wiki-fork && cd wiki-fork

git init

touch readme.txt

git add --all && git commit -m "initial commit"

Then the folder is populated with three files from the original Wikipedia. The latest markup of each article is copied and pasted into a file in the directory. What we also need is a branch:

git add --all && git commit -m "create three files"

git branch issue1

The idea is that the fork is maintained in the issue1 branch, while the original project (upstream) stays in the master branch. The merge is done with the following commands:

git checkout master

git merge issue1

git branch -d issue1 # delete branch
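The steps so far can be put together into one sketch that runs in a throwaway directory. It is only an illustration under a few assumptions: the file name article1.txt and its content are placeholders and not real Wikipedia markup, and the identity set with git config is made up. Note that git branch issue1 only creates the branch, so the sketch assumes a git checkout issue1 before editing.

```shell
# Sketch of the branch-and-merge steps in a throwaway directory.
# article1.txt and its content are placeholders, not real Wikipedia markup.
demo=$(mktemp -d)
cd "$demo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is called master
git config user.name "Demo"               # made-up local identity so that commits work
git config user.email "demo@example.com"

echo "upstream text" > article1.txt
git add --all && git commit -q -m "create file"

git branch issue1            # create the fork branch
git checkout -q issue1       # assumption: switch to it before editing
echo "my own paragraph" >> article1.txt
git commit -q -am "own edit in the fork"

git checkout -q master
git merge -q issue1          # fast-forward, because master has not changed
git branch -d issue1         # delete branch
```

Because master has no commits of its own in this sketch, the merge is a plain fast-forward and cannot conflict.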

The idea is that the issue1 branch holds my own individualized Wikipedia version, which only I can edit, similar to a sandbox. The edits are never sent back to Wikipedia; they are merged into the master branch on the local hard drive. The best way to picture it is a GitHub project from which a fork is created. In theory, this allows the developer to become independent of the original project.

The open question is how this works in reality. I searched Google for information but didn't find anything, so I have to test it myself. The critical point is that the upstream will sometimes update its content. That means if the Wikipedia community changes one of the three files online, I have to update the content in master too. The problem is that the content of the issue1 branch differs from it, so a merge is needed. It's unclear how often this is necessary. The hope is that a merge is needed only once a week and that it can be done automatically. In the worst case, however, it will result in a merge conflict, and it's unclear how to resolve it.
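To get a first feeling for the conflict case, it can be provoked on purpose in a throwaway directory. The following is a minimal sketch, not a tested procedure for the real Wikipedia: article1.txt, its sentences and the git identity are placeholders, and the upstream update is simulated by overwriting the file in master. Resolving the conflict by hand, as shown at the end, is only one possible approach.

```shell
# Sketch: upstream and fork change the same line, which produces a merge conflict.
# article1.txt and its content are placeholders.
demo2=$(mktemp -d)
cd "$demo2"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is called master
git config user.name "Demo"               # made-up local identity so that commits work
git config user.email "demo@example.com"

echo "original sentence" > article1.txt
git add --all && git commit -q -m "import from upstream"
git branch issue1

# upstream changes the file: the new version is pasted into master
echo "sentence as changed by upstream" > article1.txt
git commit -q -am "upstream update"

# the fork changes the same line in a different way
git checkout -q issue1
echo "sentence as changed by me" > article1.txt
git commit -q -am "own edit"

# bring the upstream update into the fork; git stops with a conflict
git merge master || echo "merge conflict, manual fix needed"
cat article1.txt             # the file now contains <<<<<<<, ======= and >>>>>>> markers

# resolve by hand: write the wanted text, then stage and commit
echo "sentence as changed by me, with the upstream fix" > article1.txt
git add article1.txt
git commit -q -m "merge upstream update into fork"
```

Here the merge goes from master into issue1, so the fork receives the upstream change while master stays a clean copy of the upstream. If the two edits touch parts of the file that are far apart, git can usually do the same merge without a conflict.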

What we can say is that the git tool is here to stay. It's the most advanced forking / version control system available and was designed with exactly this purpose in mind.