The term "symbol grounding" itself was coined in 1990 by Stevan Harnad, but the subject was researched much earlier. From a very abstract perspective, "symbol grounding" describes the relationship between language and reality, so it basically asks: "what is language?".
Before the advent of computers, symbol grounding was treated as a topic of linguistics and philosophy; for example, Aristotle asked in his correspondence theory of truth about the mapping from language to reality. To give an example: suppose somebody says "The apple is located on the table". This sentence describes the physical properties of a food item in the kitchen. It communicates an observation to someone else who speaks the same natural language.
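The mapping from the sentence to reality can be illustrated with a minimal sketch in Python. All names here are hypothetical and not taken from any of the projects mentioned later: the sentence is reduced to a logical predicate, which is then checked against a symbolic model of the kitchen.

```python
# Minimal sketch: ground the sentence "The apple is located on the table"
# by mapping it to the predicate on(apple, table) and checking that
# predicate against a symbolic world model. All names are hypothetical.

def parse_sentence(sentence):
    """Very naive parser for the pattern "the X is located on the Y"."""
    words = sentence.lower().rstrip(".").split()
    obj = words[1]            # "apple"
    location = words[-1]      # "table"
    return ("on", obj, location)

# symbolic model of the kitchen: a set of facts assumed to be true
world = {("on", "apple", "table"), ("on", "cup", "shelf")}

fact = parse_sentence("The apple is located on the table")
print(fact)               # ('on', 'apple', 'table')
print(fact in world)      # True
```

In the spirit of the correspondence theory of truth, the sentence is "true" exactly when its parsed predicate corresponds to a fact in the world model.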
With the advent of the microcomputer in the 1980s, the "symbol grounding problem" was researched as part of artificial intelligence. The goal was to use computers to process language. Notable examples are the SHRDLU project (text to action) and the Abigail scene recognition project from 1994 (scene to text). The most advanced example available today is the Wayve Lingo-1 software for controlling a self-driving car. This software was designed as a neural network and can understand English in the context of car driving.
A closer look at the timeline shows that symbol grounding isn't a single theory or a single algorithm; rather, different approaches were initiated in different decades of research. Their shared similarity is the objective to understand language. Language is important for human-to-human communication but also for human-to-machine communication. It seems that language is the "ghost in the machine" which allows a computer to think and make its own decisions.
The main difference between humans and machines is that machines can process language much faster. In the "Karel the Robot" project from 1981, it is possible to submit dozens of commands per second to the parser, which translates the commands into actions in the simulated environment. Such fast processing can only be realized by a computer, not by human individuals. A human might understand and react to a command in the same way, but at a rate of one command per five seconds, and sometimes slower.
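The command-to-action translation in a Karel-like environment can be sketched as follows. This is a simplified imitation written for illustration, not Pattis' original 1981 implementation; the command names and robot state are assumptions.

```python
# Sketch of a Karel-style interpreter: each textual command is parsed
# and translated into an action that updates a simulated robot state.
# Simplified imitation, not the original "Karel the Robot" software.

class Robot:
    def __init__(self):
        self.x, self.y = 0, 0
        self.heading = "east"          # east, north, west, south

    def move(self):
        dx, dy = {"east": (1, 0), "north": (0, 1),
                  "west": (-1, 0), "south": (0, -1)}[self.heading]
        self.x += dx
        self.y += dy

    def turn_left(self):
        order = ["east", "north", "west", "south"]
        self.heading = order[(order.index(self.heading) + 1) % 4]

def run(robot, program):
    """Parse one command per line and execute it as an action."""
    for line in program.splitlines():
        command = line.strip()
        if command == "move":
            robot.move()
        elif command == "turnleft":
            robot.turn_left()
        elif command:
            raise ValueError(f"unknown command: {command}")

r = Robot()
run(r, "move\nmove\nturnleft\nmove")
print(r.x, r.y, r.heading)   # 2 1 north
```

A loop like this can consume thousands of commands per second, which illustrates the speed gap between a human operator and a machine parser.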
Here is the entire timeline sorted by year:
3300 BC,Cuneiform writing system in Mesopotamia
1500 BC,sundial showing the time of the day
600 BC,Latin alphabet available in Italy
322 BC,correspondence theory of truth by Aristotle
1386,Salisbury Cathedral tower clock with a bell
1440,printing press by Johannes Gutenberg
1505,Pomander Watch by Peter Henlein
1792,optical telegraph by Claude Chappe
1844,morse code by Samuel Morse
1870,Engine Order Telegraph by William Chadburn
1876,commercial typewriter by Remington
1878,chronophotography "The Horse in Motion" by Eadweard Muybridge
1903,Telekino remote controlled boat by Leonardo Torres Quevedo
1915,Therblig notation by Frank Gilbreth
1915,rotoscoping animation technique by Max Fleischer
1920,AAC Communication board by F. Hall Roe
1928,Labanotation dance notation by Rudolf von Laban
1930,motion tracking by Nikolai Bernstein
1950,Turing test by Alan Turing
1959,Pandemonium architecture by Oliver Selfridge
1962,ANIMAC motion capture by Lee Harrison III
1963,ASCII code
1966,ELIZA chatbot by Joseph Weizenbaum
1968,SHRDLU natural language understanding by Terry Winograd
1971,Lexigram for communicating with apes by Ernst von Glasersfeld
1977,Zork I text adventure by Tim Anderson
1977,Tour model instruction following by Benjamin Kuipers MIT AI lab
1980,Chinese room argument by John Searle
1980,Commentator scene description by Bengt Sigurd
1980,Finite State machine in Pacman videogame by Tōru Iwatani
1981,Karel the robot programming language by Richard Pattis
1983,MIDI music protocol
1983,M.I.T. Graphical Marionette by Delle Maxwell
1984,Castle Adventure by Kevin Bales
1987,Maniac Mansion point&click adventure by Ron Gilbert
1987,Vitra visual translator by Wolfgang Wahlster
1990,Physical Grounding Hypothesis by Rodney Brooks
1990,paper "The symbol grounding problem" by Stevan Harnad
1993,AnimNL computer animation by Norman Badler
1993,conceptual spaces by Peter Gärdenfors
1994,Abigail scene recognition by Jeffrey Siskind
1998,Rocco Robocup commentator by Dirk Voelz
1999,TREC-8 Text REtrieval Conference
2003,M.I.T. Ripley robot by Deb Roy
2006,Marco route instruction following by Matt MacMahon
2007,Simbicon computer animation by Michiel van de Panne
2010,Motion grammar by Mike Stilman
2010,M.I.T. forklift by Stefanie Tellex
2011,IBM Watson Question answering by David Ferrucci
2013,Word2vec algorithm by Tomas Mikolov
2015,Poeticon++ trajectory recognition by Yiannis Aloimonos
2015,DAQUAR VQA dataset by Mateusz Malinowski
2020,Vision language model by different authors
2023,Wayve Lingo-1 self driving car
Perhaps it makes sense to focus on language itself. Language in its core meaning is natural language like English or French. It was invented long ago as a tool, similar to a hammer or the steam engine, but not as a physical device: language acts as a mental tool. Languages are very old inventions; for example, the alphabet with 26 characters from A to Z has been known for over 2600 years.
The new thing known as the symbol grounding problem is a more technological perspective on language. Instead of only learning a language, which means memorizing the vocabulary, the task is to understand what the purpose of English is. Or, to be more specific, how language allows humans to think. This question is to date an unsolved problem. There are some signs available that language is processed by the brain, and it is also known that artificial neural networks simulated by a computer can imitate this behavior. This allows machines to be used to parse natural language, including its mapping to reality.