May 08, 2019

Describing social networks from the outside


Many books and videos have been created that explain what social media is and how to use it correctly. The problem with these tutorials is that social media itself contains almost no content, so its shape cannot be described from the inside. A typical newsstream in a social media group contains only a few bytes. Even a complete social network database can be compressed into a small ZIP archive, and that file holds no real content. What a social network group consists of are short comments, URLs, and a boolean value recording whether the like button was pressed or not. Trying to understand the network from this information alone is not possible.
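To make this concrete, here is a minimal sketch in Python of the data behind a single posting; the field names are my own invention, not the schema of any real network.

    import json

    # Hypothetical record for one posting in a social network group:
    # a short comment, a link, and a flag for the like button.
    posting = {
        "comment": "Interesting paper about RSS aggregation",
        "url": "https://example.org/paper.pdf",
        "liked": True,
    }

    # Serialized, the whole posting is well under 200 bytes.
    print(len(json.dumps(posting)))
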
A better approach to getting a picture of what a website like Google+ is doing is to introduce the term "news aggregator". The term is clearly defined in the literature. If something is aggregated, there has to be something much bigger that it is aggregated from. A combined RSS feed, for example, needs external feeds before it can be created. The question is not what the users are doing inside a Google+ group; the question is what the internet looks like outside of Google+.
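As a rough sketch of what aggregation means technically, the following Python snippet merges a few external RSS feeds into one combined list. It assumes the third-party feedparser package, and the feed URLs are only placeholders.

    import feedparser

    # Placeholder feeds; an aggregator only makes sense if these
    # external sources exist first.
    FEED_URLS = [
        "https://example.org/blog/feed.xml",
        "https://example.net/news/rss",
    ]

    combined = []
    for url in FEED_URLS:
        feed = feedparser.parse(url)
        for entry in feed.entries:
            combined.append((entry.get("published", ""),
                             entry.get("title", ""),
                             entry.get("link", "")))

    # The aggregated feed is simply the union of the external feeds,
    # sorted (naively, by the date string) so newer items come first.
    combined.sort(reverse=True)
    for published, title, link in combined:
        print(published, title, link)
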
Let us try to describe what the internet is doing from a live perspective. According to the latest (faked) statistics, around 4 million blog posts are created each day, 150 billion emails are sent, and 70 million photos are uploaded.
Additionally, around 2700 books are published each day in the United States, and around 2700 songs are written. Content aggregation is the art of combining all these activities into a single stream of information. It can be understood as a playlist that stores the URLs from the last day back to the last week. Everything older than that is no longer news, but history.
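The playlist idea can be sketched as a simple time window: keep everything that is at most one week old and treat the rest as history. The sample items below are invented.

    from datetime import datetime, timedelta

    now = datetime(2019, 5, 8)

    # Invented sample items: (publication date, URL)
    items = [
        (datetime(2019, 5, 7), "https://example.org/new-blogpost"),
        (datetime(2019, 5, 3), "https://example.org/song-release"),
        (datetime(2019, 4, 1), "https://example.org/old-article"),
    ]

    # Anything within the last week is news, everything else is history.
    playlist = [url for date, url in items if now - timedelta(days=7) <= date <= now]
    history = [url for date, url in items if date < now - timedelta(days=7)]

    print("playlist:", playlist)
    print("history:", history)
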
From a technical perspective it is possible to create a worldwide RSS newsfeed which contains all the new blog posts, all the newly published books, computer games and uploaded photos. But such a stream would carry far too much information. The first idea for making the topics more readable is to create groups. All large social networks use this trick, and it helps a lot in splitting the overall stream into smaller parts. A second idea is to leave out as much as possible; this is done with a voting system and with the tricks admins use to ban users. If only a small number of people are allowed to make the news headlines, it is much easier to focus on the important topics.
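Both tricks, grouping and voting, can be sketched in a few lines of Python; the group names, vote counts and the threshold are invented.

    # Invented items: (topic group, votes, url)
    items = [
        ("robotics", 12, "https://example.org/robot-arm"),
        ("robotics", 1,  "https://example.org/servo-review"),
        ("music",    7,  "https://example.org/new-album"),
    ]

    MIN_VOTES = 5  # assumed threshold: leave out as much as possible

    groups = {}
    for group, votes, url in items:
        if votes >= MIN_VOTES:                        # voting filter
            groups.setdefault(group, []).append(url)  # grouping

    for group, urls in groups.items():
        print(group, urls)
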
It is no coincidence that offline gatekeepers such as TV stations and printed newspapers are very active in social networks on the internet, because social networks work on the same principle as an offline medium. The interesting point is that the resulting feed can be created with different techniques; each person and each newspaper uses a different set of priorities. What remains the same is the problem of crunching all the information created outside the social network into a handy stream, which is the social network group.
A fulltext search engine doesn't have this problem, because a fulltext engine catalogues all the information. There is no need to decide whether a certain website should be crawled or not; all the information is collected. The information overload problem does not exist for search engines, because the Google frontend always looks empty and, apart from a search box, no newsfeed is shown.
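The contrast can be made concrete with a toy inverted index: everything gets catalogued, and nothing is shown until a query arrives. This is of course a strong simplification of what a real search engine does.

    from collections import defaultdict

    # Two invented pages standing in for the whole web.
    pages = {
        "https://example.org/a": "social media has no content",
        "https://example.org/b": "rss feeds aggregate external content",
    }

    # Catalogue everything; no decision about which page is "important".
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.split():
            index[word].add(url)

    # The frontend stays empty until the user types a query.
    print(index["content"])
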
Social networks work slightly differently. The idea here is to provide a newsstream; fulltext search is possible, but it has a lower priority. Instead the newsstream is curated linearly, which means it consists of around 100 entries and the reader reads them from top to bottom. What social media does is decide and argue about the playlist. The question is which information is more important. The principle is known from the music industry and is called the charts: a music ranking is a selection made from all the songs.
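A minimal sketch of such a chart-style ranking, with invented vote counts:

    # Invented entries: (votes, url); the newsstream is a ranked playlist.
    entries = [
        (42, "https://example.org/top-story"),
        (3,  "https://example.org/minor-note"),
        (17, "https://example.org/second-story"),
    ]

    # Like the music charts: order by popularity and cut the list,
    # here to the 100 entries the reader scrolls from top to bottom.
    newsstream = sorted(entries, reverse=True)[:100]
    for votes, url in newsstream:
        print(votes, url)
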
Modern software such as online forums, wikis and voting systems can help to coordinate large groups in building a newsstream manually. In contrast to the Google index, the newsstream in a social network isn't created automatically, but with manual intervention. The more people are members of a group, the more intervention takes place, and this translates into higher traffic. Let me give an example:
A single PDF file containing an article is 3 MB in size. The file is stored on the original server and mirrored in the Google index. The traffic for this file is zero: nobody is reading it, the file simply sits on the server and waits for its audience. Now the URL is posted to a Google+ group. Before this can be done, the poster reads the file to make sure the URL is valid, and during this access the first 3 MB are transferred over the internet. In the group, two other users see the URL and click on the address, which results in an additional 2 x 3 = 6 MB of traffic. Both come to the conclusion that the file is boring, and the traffic drops back to zero.
The file size of the PDF document remains constant, but the amount of traffic the file produces depends on the decisions of humans. If the group comes to the conclusion that the file is interesting, the traffic will grow quickly. This decision-making process has nothing to do with technical settings, only with the behavior of humans. What computer servers do is react to the demand: if a webserver gets a request, it delivers the file.
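The arithmetic of the example can be written down directly; the numbers are the ones used above.

    FILE_SIZE_MB = 3

    # One download by the poster who checks the URL,
    # plus two downloads by group members who click on it.
    downloads = 1 + 2

    traffic_mb = FILE_SIZE_MB * downloads
    print(traffic_mb)  # 9 MB in total, of which 2 x 3 = 6 MB came from the group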