May 09, 2019

Content aggregation in the Web 2.0


According to Dieu and Stevens ("Pedagogical affordances of syndication, aggregation, and mash-up of content on the Web." TESL-EJ 11.1 (2007): 1-15), Web 2.0 consists of different kinds of websites:
- blog hosting (WordPress)
- photo hosting (Flickr)
- microblogging (Twitter)
- mapping (CommunityWalk)
- podcast hosting
- video hosting
The paper is a bit outdated because recent developments like YouTube and Instagram are not mentioned, but in general the list of hosting providers makes sense. These websites have in common that the user can upload a file from his local computer to the Internet. After the file is submitted, it is available under a URL.
On page 4 of the paper, an additional category of websites is mentioned which should be treated separately. It is called "social networks":
Quote: "These are integrated social platforms designed to allow people to easily connect and share resources."
Social networks are often described as the most important hub in Web 2.0, superior to normal hosting websites. Why? The article itself doesn't provide an answer, but it gives some hints in that direction. It has to do with RSS feeds, content aggregation and social sharing. Let us set the paper aside and try to answer the question on our own. The list of web hosting providers makes sense. If somebody has created a video, he has to upload the video to a hoster. If someone has written a text, he can store it in a blog, and so on.
But how can all these sources be combined into something useful? What the individual user has uploaded is known; what is unknown is how the other side will select the information stored on the Internet. Social networks like Facebook answer this question. Without a social network, the content on the Internet won't produce traffic.
Perhaps we should define what a forum is. The paper doesn't categorize it. In my opinion, a forum is similar to a WordPress blog or a normal website. A forum was not invented for content aggregation; only social networks and aggregated newsfeeds can do that. A forum can produce an RSS feed with all the latest postings, but this feed has to be combined with the feeds from blogs, from Twitter and from video hosters into one aggregated feed.

In the picture, no search engine is available, and none is needed. Instead, the RSS feed is produced in a linear way. That means the user doesn't type in keywords, but scrolls through his feed from top to bottom. Some beginners may think this principle is unique to Facebook, but the correct term is simply "content aggregation". It means combining existing RSS feeds into a new one which contains more entries.
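The merging step itself is easy to sketch. Assuming each source feed has already been parsed into a list of (date, URL, title) entries (a hypothetical simplification; a real aggregator like Planet would parse the RSS XML first), aggregation is just a chronological merge:

```python
from datetime import date

# Hypothetical pre-parsed feeds: each entry is (published, url, title).
blog_feed = [
    (date(2019, 5, 8), "https://example-blog.org/post-42", "New robot arm"),
    (date(2019, 5, 1), "https://example-blog.org/post-41", "Sensor tutorial"),
]
video_feed = [
    (date(2019, 5, 7), "https://example-videos.org/v/9", "Line follower demo"),
]

def aggregate(*feeds):
    """Combine several feeds into one stream, newest entry first."""
    merged = [entry for feed in feeds for entry in feed]
    return sorted(merged, key=lambda e: e[0], reverse=True)

combined = aggregate(blog_feed, video_feed)
```

The combined list is exactly what the reader scrolls through from top to bottom: three entries from two sources, ordered by date.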
It is important to know that the sources on the left (the hosting websites) do not deliver the user's own content to the right section. It makes no sense for a user to ask for files he has created himself; he knows what is written in his own blog. What the user is interested in is information created by others. Describing the situation is a bit complicated, so we have to change the point of view first. Social networks can be described without focusing on the user; instead, we have to focus on the content sources. I'd like to give an example.
Suppose there is a virtual community around Lego Mindstorms. The content which belongs to the community consists of text blogs, photos, videos, podcasts and forum postings. In the picture, this content is stored on the left side at the hosting websites. The problem of the Lego Mindstorms community is to connect with each other. That means the community needs comments, needs an audience, needs more members and needs new content. Basically, the "Lego Mindstorms" community has to use the Planet aggregator software, build an RSS feed of all the content of the last week, and put the stream on a website. This is equivalent to building a frontpage for the community. Such a stream can be consumed by users.
Now we can explain the idea with an example. The Planet GNOME website is an example of a virtual community. On Planet GNOME the user finds the latest blog posts, the latest podcasts and the latest forum entries of the GNOME community. A Google+ group works on the same principle. In a Google+ group about Lego Mindstorms, a news feed is shown which is the result of a combined RSS feed. It contains text posts, videos and pictures.
Like I mentioned before, we can ignore the individual user in a social network completely, because a virtual community is not interested in delivering content to a certain audience or getting comments from a certain user; what the community is asking for is connection within itself.
Let us take a look at a typical social network page. What the reader gets is a combined newsfeed from different sources. In a Google+ group about Lego Mindstorms, he gets blog posts from different Lego Mindstorms blogs, plus some videos and photos. All the content is ordered chronologically. That means a social network group is the ideal place for a community to present itself to the public. In theory it is possible to get access to the content without using a social network, for example by entering a keyword into Google and searching for a certain blog. But then the content is isolated: a certain blog post is not connected to a similar podcast. This is not desired by the community and not wanted by the user.
The concept of a social network and aggregated content has a large disadvantage: it results in information overload. It is very easy to create an information stream which contains too much information from too many sources. It is important to know that the content on the left side of the image was not created by a single newspaper or a single blogger; it contains the content of different bloggers who don't know each other. The resulting RSS feed is more powerful than a normal feed from a single blog. Powerful means that it contains too many details, and a normal user can't filter the information.
Let us describe the situation of a user seeing Facebook for the first time. The common behavior is that the user is impressed. He gets all the information he needs, and the feed is updated continuously. That means a social network is much better than the normal Internet. In contrast to a forum, it connects different websites and allows community building. The big problem of social networks is that most users become victims of the information. They get high-quality information, and this makes them angry, fearful and panicked.
Sometimes the emotional reaction of Facebook users is attributed to Facebook itself. But the same effect is visible on websites like Planet GNOME: if a user scrolls through the latest news, he can become angry too. So it has nothing to do with a certain social network; information overload is the result of content aggregation. If high-quality content from separate websites is mixed into a single stream, the user will become a victim of the content.
One possible answer is to avoid any kind of social network and use a normal fulltext search engine. This strategy helps to reduce the information overload problem: the user gets only information he has asked for, and he is in control of the filter. A second option is to improve social networks with a different filter mechanism which allows the user to decide better which content he needs and which he doesn't.
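The second option, a user-controlled filter, can be sketched in a few lines. The stream entries below are hypothetical; the point is only that the filter criterion (here, a keyword list) belongs to the user, not to an opaque algorithm:

```python
# Hypothetical aggregated stream: entries from several sources mixed together.
stream = [
    {"source": "blog",  "title": "Lego Mindstorms sensor tutorial"},
    {"source": "video", "title": "Unrelated viral clip"},
    {"source": "forum", "title": "Mindstorms firmware question"},
]

def filter_stream(entries, keywords):
    """Keep only entries whose title mentions one of the user's keywords."""
    keywords = [k.lower() for k in keywords]
    return [e for e in entries
            if any(k in e["title"].lower() for k in keywords)]

wanted = filter_stream(stream, ["mindstorms"])
```

Because the user wrote the keyword list himself, he can always explain why an entry appears in his feed, which is exactly what the like-button and ranking-algorithm approach cannot offer.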
Playlist
The good news is that content aggregation can be reduced to a simple question: what is in the playlist? A playlist is a textfile which contains URLs to external websites. Each entry consists of a date, a URL and a tag. Such a list can be small or it can be large.
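Such a playlist file is trivial to parse. A minimal sketch, assuming a hypothetical one-entry-per-line format with "|" as separator (the concrete syntax is an assumption, not prescribed anywhere in the text):

```python
from datetime import date

# Hypothetical playlist format: one "date|url|tag" entry per line.
PLAYLIST = """\
2019-05-08|https://example-blog.org/post-42|mindstorms
2019-05-07|https://example-videos.org/v/9|mindstorms
2019-05-01|https://example-photos.org/p/77|lego
"""

def parse_playlist(text):
    """Turn the playlist textfile into (date, url, tag) tuples."""
    entries = []
    for line in text.splitlines():
        if not line.strip():
            continue
        stamp, url, tag = line.split("|")
        entries.append((date.fromisoformat(stamp), url, tag))
    return entries

entries = parse_playlist(PLAYLIST)
```

Rendering a social network frontpage then means nothing more than fetching a preview for each URL in this list.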
What social networks and content aggregators are doing amounts to creating and updating such playlists. The result is rendered on the screen, and the user sees a preview image. Because content aggregation and producing a playlist are the same thing, the task can be simplified to the problem of how to create a URL list.
Suppose the resulting URL list has 1000 entries and takes up 40 kb. The good news is that, apart from this simple list, nothing else is needed for a social network. If the URL list is not available, the content aggregation has failed, and the user has to search manually for the blogs and text files.
A lot of technologies are available to edit 40 kb text files. The simplest one is a text editor. A more advanced technique is an Excel spreadsheet. The most advanced technology I'm aware of for editing a 40 kb textfile is a wiki. That means the textfile is uploaded to a wiki and different users can edit the list in realtime. No current social network uses a wiki system to create URL lists. What Facebook and Google+ are using is an intransparent voting system (the famous like button); they use admins who can ban users from a group; and they use mythical algorithms which add new URLs to the list if a certain trigger is active. Especially this last technique, adding new URLs to the playlist as if by a random generator, will result in information overload. If the user can't explain why he sees a certain picture on the screen, he is not in control of the situation. He was fooled by the Facebook stream, and a troll in the background laughs at the victim.
To describe the phenomenon from an abstract perspective, the first step is to recognize that a social network consists of playlists which contain URLs to content. The second step is to ask how the lists are created, and especially how a group of people can generate such playlists.
The advantage of a wiki for editing URL playlists is that the entire process is visible. Each edit is stored in the version history. And if the URL list contains some weird links, it can be traced back who exactly added the linkspam. If the process of URL curation is made transparent, we can discuss what the workflow for creating such a list should be. Should an upvote/downvote system be integrated? Should the user have the right to comment on a link? Can the list be modified by bots? And if yes, what is the source code of the bot that adds URLs?
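The version-history idea can be sketched as a tiny data structure. This is a hypothetical toy model, not the storage format of any real wiki engine, but it shows why linkspam is traceable: every change carries its author.

```python
# Minimal sketch of a wiki-like version history for a URL playlist.
class VersionedPlaylist:
    def __init__(self):
        self.urls = []
        self.history = []  # list of (author, action, url)

    def add(self, author, url):
        """Append a URL and record who added it."""
        self.urls.append(url)
        self.history.append((author, "add", url))

    def revert_last(self, author):
        """Undo the most recent addition, like a wiki revert."""
        url = self.urls.pop()
        self.history.append((author, "revert", url))

    def who_added(self, url):
        """Trace a suspicious link back to its author."""
        for author, action, u in self.history:
            if action == "add" and u == url:
                return author

playlist = VersionedPlaylist()
playlist.add("alice", "https://example-blog.org/post-42")
playlist.add("spambot", "https://spam.example.com/buy-now")
spammer = playlist.who_added("https://spam.example.com/buy-now")
```

With this record, an admin can not only remove the spam link via a revert, but also see exactly which account introduced it, which is the transparency the like-button model lacks.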
These questions can't be answered in existing social networks like Google+ or Facebook, because they deny that a URL playlist exists in the system at all, and they don't provide transparency about how the list was made. The only way to improve the situation is a wiki. If a wiki is used for creating the playlist, it is possible to argue with the admins in the wiki about what they are doing.
In theory, even an edit war around a URL playlist is possible. If two admins have different opinions, they will revert each other's edits. The good news is that, as known from other wiki projects, even edit wars don't hurt the overall structure. Instead, edit wars have a tendency to escalate inside the wiki, which means the issue is put on the discussion page and a higher-ranked admin will ban one of the users.