May 06, 2019

Building a social network website from scratch



Many people are fascinated by social network websites like Facebook, Twitter and Reddit, and they want to understand how to clone such a website. Somebody who is able to clone a product or a website has understood its inner workings. Cloning is a kind of reverse engineering with the aim of opening the black box and analyzing what is hidden inside.
The first step in building a social network from scratch is the technical side. What is needed here is software like Elgg or HumHub, which are open source projects for building a social network in an intranet. That means the users do not depend on large-scale websites like Facebook but create their own space hosted on their own server.
But is the Elgg software the same thing as a social network? After installing the program on a server, the system works fine technically, but something is missing: users and traffic. From an abstract perspective, a social networking site allows users to post comments, similar to a forum website. But what exactly is the difference between forum software like phpBB, blogging software like WordPress and social networking software like Elgg? From a technical perspective, all of them work in a similar way. In the backend there is an SQL database, and in the frontend there is a PHP script which generates the HTML form shown on the screen. The user types in something and the information is stored in the database.
This allows reducing a social network tool to its core feature. There is no need to install software like Elgg; a simple Python script which has access to a database can do the same job. It seems that social networks are not based on a certain scripting technology but on the users' behavior.
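To make the point concrete, here is a minimal sketch of such a script. It is only an illustration under my own assumptions: SQLite from the Python standard library stands in for the SQL database, and the table name `posts` and the function names are made up for this example.

```python
import sqlite3


def open_db(path=":memory:"):
    """Open the database and create the posts table if it is missing."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS posts ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "author TEXT NOT NULL, "
        "body TEXT NOT NULL, "
        "created TEXT DEFAULT CURRENT_TIMESTAMP)"
    )
    return conn


def add_post(conn, author, body):
    """Store what the user typed in and return the id of the new row."""
    cur = conn.execute(
        "INSERT INTO posts (author, body) VALUES (?, ?)", (author, body)
    )
    conn.commit()
    return cur.lastrowid


def list_posts(conn):
    """Return all posts, newest first, as (author, body) tuples."""
    rows = conn.execute("SELECT author, body FROM posts ORDER BY id DESC")
    return rows.fetchall()
```

Calling `add_post(conn, "alice", "hello")` and then `list_posts(conn)` is already the whole cycle of a forum, a blog or a social network: text goes in, text comes out. Everything else is a question of what the users do with it.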
Suppose that in an intranet it is possible to set your own rules for how the users should act. What are the soft rules everybody has to follow in order to build a social network? I would guess this is the more important recipe: to moderate the users in a certain direction. The good news is that in an intranet the technical admin is able to ban users who ignore the rules. That means the admin is in the comfortable position that his rules have to be followed. The follow-up question is which kind of rules are needed for a social network. If the admin doesn't know, the project won't be successful.
I have researched the topic a bit. The social rule which Twitter, Facebook and Reddit have in common is that these websites understand themselves as blog aggregators. This self-description allows formulating the rules for the Reddit clone:
rule 1: it is forbidden to post longer content. That means, if somebody tries to upload a 1 MB PDF file or posts a 100 kB text file to the social network, he gets banned.
rule 2: what the users are allowed to post are URLs, plus a short description of what is behind the URL.
rule 3: the posted URL should fit the section, for example Artificial Intelligence, and the URL should be fresh. That means it references content created no more than one month ago.
rule 4: if somebody posts URLs which do not fit the domain, or the URL is outdated, the user gets banned.
These 4 simple rules are able to replicate a social network in the intranet. If the admin checks that the rules are respected by the users, the newly created social network will look similar to the large websites (Facebook, Reddit, Twitter). Perhaps the intranet social network will not be exactly the same, because the number of users is smaller and because the real Reddit website is more attractive, but the general idea is the same.
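The four rules can be written down as a small check which the admin runs against every submission. This is only a sketch under assumptions of mine: the 300 character limit for a "short description", the 31 days for "one month", the `Submission` fields and the idea that a moderator fills in the `topic` of the linked content by hand are not part of the rules above, they are just one way to make them testable.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from urllib.parse import urlparse

MAX_DESCRIPTION_CHARS = 300             # assumed limit for a "short description"
MAX_AGE = timedelta(days=31)            # assumed reading of "one month"
SECTIONS = {"Artificial Intelligence"}  # example section taken from rule 3


@dataclass
class Submission:
    url: str
    description: str
    section: str        # section the user posted into
    topic: str          # topic of the linked content, judged by a moderator
    content_date: date  # when the linked content was created


def violations(sub, today):
    """Return a list of broken rules; an empty list means the post is fine."""
    problems = []
    parsed = urlparse(sub.url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        problems.append("rule 2: post must contain a URL")
    if len(sub.description) > MAX_DESCRIPTION_CHARS:
        problems.append("rule 1: description too long")
    if sub.section not in SECTIONS or sub.topic != sub.section:
        problems.append("rule 3: URL does not fit the section")
    if today - sub.content_date > MAX_AGE:
        problems.append("rule 3: linked content is outdated")
    return problems


def should_ban(sub, today):
    """Rule 4, read strictly: any violation leads to a ban (an assumption)."""
    return bool(violations(sub, today))
```

In practice the hard part is not the code but the `topic` field. Whether a link fits the section is a judgment the admin or a moderator has to make, which is exactly why the rules are social and not technical.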
Sometimes social networks are described as social tagging websites in which the users can post links. This description is not sufficient, because a social network doesn't have an internal structure of its own. What the users do in the network depends on the content located outside the network. Before somebody can post a URL, he has to know the URL first. And before somebody finds a URL on the Internet, somebody else has to put content behind the URL. The better description of what social networks are about is a blog aggregator. It is a kind of overview website which monitors the blogs that were updated recently.
Detail features like upvoting, downvoting and commenting on links from others are additional features which make the social network more interesting. A feature to preview the content in a smaller window and a strong focus on entertaining or funny subjects will also result in an improved social networking site.
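Upvoting and downvoting need very little logic. The following sketch keeps everything in memory. The class name `LinkBoard` and the rule that each user has exactly one vote per link are assumptions made for the example, not a description of how Reddit does it.

```python
from collections import defaultdict


class LinkBoard:
    """Keeps one vote per user and link and ranks links by their score."""

    def __init__(self):
        # votes[url][user] is +1 (upvote) or -1 (downvote)
        self._votes = defaultdict(dict)

    def vote(self, url, user, value):
        """Record a vote; voting again replaces the user's earlier vote."""
        if value not in (1, -1):
            raise ValueError("value must be +1 or -1")
        self._votes[url][user] = value

    def score(self, url):
        """Sum of all votes for a link; unknown links have a score of 0."""
        return sum(self._votes.get(url, {}).values())

    def ranking(self):
        """All voted links, highest score first."""
        return sorted(self._votes, key=self.score, reverse=True)
```

Because a repeated vote only overwrites the old one, a single user can't push a link to the top by clicking many times.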
What we can say for sure is that a social network contains URLs to external content, which are posted by a group of people and sorted into domains like “fun, games, party pictures and so on”. In the case of Facebook the situation is not so strict, because users can upload images directly to Facebook, so it has the features of an image hosting website, while on Twitter the postings are not organized in groups but around the people who have written a message. The smallest common definition is that social networks monitor the blogosphere. They are a kind of realtime search engine created by humans. Usually, the amount of interaction on these websites is higher than in a normal forum. A forum collects information posted by its users. A forum does not monitor external content.
In the academic community, the counterpart of a social network is the overview paper. An overview paper doesn't present new information about a topic but reviews existing content. The typical overview paper contains around 300 references, but sometimes up to 1000. In contrast to a social network, an overview paper can't be commented on in realtime. The similarity is that social networks and overview papers both try to stand on top of the community. They are in a higher position and make jokes and recommendations about existing information. In most cases, overview papers are written by experts of a domain who have read all the information and are able to give the context which makes it easier for newbies to identify relevant information.
The interesting point is that even in the age of search engines, overview papers remain important. They answer the question of which topic is interesting and why. In contrast, a search engine only makes sense if the user is already an expert and knows which keywords to enter.
Cloning Reddit
In the introduction it was mentioned that a website has to be understood before it can be cloned. Suppose the idea is to create a Reddit-like website in your own intranet which generates a lot of traffic. The rules for doing so are:
1. the users should post URLs plus small descriptions
2. the links should be fresh and reference content created less than a month ago
3. the only allowed topics are “funny”, “jokes” and “computergames”
4. many users can log in to the site, discuss and upvote the postings of others
If all these rules are respected strictly, the intranet website will develop in the same direction as the Reddit example. It will become a high-traffic hub. The reason is the mixture of several features. First, it is not very complicated to post a URL on a website. It can be done in under 2 minutes. This motivates the users to do it very often. Second, the topic is jokes, fun and computergames, which is attractive to almost all users. And third, the users can post without asking first, so they are in control of the website.
All these features combined make it likely that the website is perceived as useful and will attract new users to participate. The success can be increased if a bit of marketing is done for the project, if the technical infrastructure runs stably and if some starting posts are available before the first users are invited to become active.
Now it is possible to describe some counter rules which will result in lower traffic on the social network. The counter rules are:
1. funny topics and jokes are forbidden, computergames too. Only hard scientific topics and programming URLs are welcome
2. the need for posting fresh URLs is reversed. Instead, it is only allowed to post URLs older than 1 year.
Both rules combined make the website less attractive for the community. Perhaps some users will participate in the project anyway, because they have no need to talk about funny things or jokes. And perhaps they find the idea interesting that the URLs should not be fresh. But in general the number of users will be lower. The result will look different from Reddit. It will become a low-frequency social network which is missing something.
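The difference between the Reddit-like rules and the counter rules fits into two small configuration objects. The sketch below is my own illustration: the class name `Policy`, the topic labels "science" and "programming", and the use of 31 and 365 days for "a month" and "1 year" are assumptions, and the age limits are treated as inclusive for simplicity.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class Policy:
    allowed_topics: frozenset
    min_age: timedelta  # youngest linked content that is still accepted
    max_age: timedelta  # oldest accepted content; timedelta.max = no limit

    def accepts(self, topic, age):
        """True if a link with this topic and content age may be posted."""
        return (topic in self.allowed_topics
                and self.min_age <= age <= self.max_age)


# Reddit-like rules: fun topics only, content at most about a month old.
REDDIT_LIKE = Policy(
    allowed_topics=frozenset({"funny", "jokes", "computergames"}),
    min_age=timedelta(0),
    max_age=timedelta(days=31),
)

# Counter rules: serious topics only, content at least about a year old.
COUNTER = Policy(
    allowed_topics=frozenset({"science", "programming"}),
    min_age=timedelta(days=365),
    max_age=timedelta.max,
)
```

The software behind both sites is identical. Only the two policy objects differ, and according to the argument above that difference alone decides whether the site becomes a high-traffic hub or a quiet archive.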