The AI revolution that started in 2023 differs from the PC revolution of the 1990s because it is harder to understand. There is no single tangible technology like the desktop PC, but multiple abstract technologies that produce surprising results. To gain an overview of the development, we need to identify key patterns common to all large language models, past and future.
What these AI systems have in common is that they work with natural language and are optimized for benchmarks. Both language understanding and benchmark complexity have improved over the months, which means that current LLMs are more powerful than their predecessors.
The surprising thing is that attempts to measure artificial intelligence were made even before 2010. In computer chess, the Elo score is used to measure the probability of winning a game, whether it is played by humans or machines. The latest chess engines have an Elo score far higher than the best human chess player. Modern AI systems are benchmarked not only with the chess Elo score but also with question-answering quizzes, image-generation benchmarks, and coding benchmarks. The limitation of an AI is set by its benchmark: if the benchmark cannot measure correctly, the AI system scoring on that benchmark is of low quality.
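The Elo score mentioned above maps a rating difference to a win probability. A minimal sketch of the standard formula (the example ratings are illustrative, not official figures):

```python
def elo_win_probability(rating_a: float, rating_b: float) -> float:
    """Probability that player A beats player B under the standard
    Elo model with a 400-point scale."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))

# Equal ratings give a 50% chance for either side.
print(elo_win_probability(1500, 1500))  # → 0.5

# A hypothetical 3500-rated engine against a 2850-rated human champion:
print(round(elo_win_probability(3500, 2850), 3))  # → 0.977
```

This illustrates why engine-versus-human chess stopped being an interesting benchmark: once the rating gap is several hundred points, the human's predicted win probability becomes negligible.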
The general pathway toward AI is to first create a benchmark, for example a VQA benchmark, then test existing AI systems against it, and in a third step improve the original benchmark. This development workflow produces a self-evolving ecosystem of ever more complex benchmarks combined with more powerful large language models.
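The three-step workflow can be sketched as a toy loop. All names and the scoring rule here are hypothetical illustrations, not a real evaluation harness:

```python
def evaluate(model_skill: float, difficulty: float) -> float:
    """Toy score in [0, 1]: a model solves tasks up to its skill level."""
    return min(1.0, model_skill / difficulty)

def evolve_benchmark(difficulty: float, scores: list) -> float:
    """Step 3: if models saturate the benchmark, make it harder."""
    if max(scores) >= 0.95:
        return difficulty * 2.0
    return difficulty

difficulty = 1.0                    # step 1: create the benchmark
model_skills = [0.5, 1.2]           # two hypothetical AI systems
for generation in range(3):
    scores = [evaluate(s, difficulty) for s in model_skills]  # step 2: test
    difficulty = evolve_benchmark(difficulty, scores)         # step 3: improve

print(difficulty)  # → 2.0 (the benchmark hardened once, then stabilized)
```

The point of the sketch is the feedback loop: the benchmark only grows harder when some system saturates it, which mirrors how saturated benchmarks like chess were replaced by harder ones.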
The main difference between the chess-playing AI of the 1990s and current LLMs like ChatGPT is that the benchmark is more difficult. Playing and winning a game of chess or tic-tac-toe is no longer recognized as a serious challenge; all existing AI systems can do it. The newly invented obstacles are understanding documents, generating prose, and generating videos. These challenges are serious obstacles for current AI systems.
What makes the situation a bit unusual is that the newly created or discovered benchmarks are closely related to human problem-solving skills. Recognizing objects in an image and answering questions about a document is similar to what humans do. Physical tasks, like pick-and-place operations or assembling a car, are also close to humans' daily life. If robots and AI software can perform these tasks with the same or even higher precision, this is perceived as a technological singularity.
It is hard or even impossible to find a benchmark that humans can solve but machines cannot. Only very advanced tasks, such as driving a car or writing academic papers, remain unsolved by machines. Instead of analyzing how a certain LLM works internally, the more important question is which benchmarks a given AI can solve. This gives a better picture of the current state of AI technology.
April 01, 2025
Birds eye perspective towards AI
Labels:
Human level AI