March 22, 2026

Instruction following for videogames

The most advanced AI model yet is "Google Sima 2" from the year 2025.[1] The most remarable element isn't the inner working of the agent but its interaction with a human. Its basically a teleoperated video game based on natural language. The human operator enters a command and the AI is translating the text into action.

In other words, the AI itself isn't running a sophisticated algorithm and tries to win any game, but the AI is a text to action converter known as "instruction following". Similar to a large language model the working thesis is, that natural language is the key element for abstract thinking. Describing the next action of character in a videogame in English provides an abstraction mechanism. Instead of mathematical state space used for early chess games based on alpha beta pruning, the "Sima 2" agent utilizes natural language as communication tool.

Of course the input command can be generated by human operators and by large language models both. In the second case the agent will be controlled autonomously like a fully working ingame AI. So the human operator enters only the general command like "win the game" and the large language model will generate all the sub commands and submits them to the Sina 2 agent.

Let me explain the mechanism from a different point of view. Suppose the parser gets removed from the Sina 2 agent so he won't no longer understand natural language. In such a case, the AI can't do anything. It is not able to play the game or solve very simple puzzles. In other words, the intelligence of the AI depends entirely on natural language understanding. The agent needs to know what the term "jump" is about and needs to localize objects like "wall" and "coin" in the game. The software is utilizing the power of English which consists of verbs, adjectives and nouns to interact with videogames.

[1] Bolton, Adrian, et al. "Sima 2: A generalist embodied agent for virtual worlds." arXiv preprint arXiv:2512.04797 (2025).

No comments:

Post a Comment