In addition to the previous demonstration with language-guided robotics, here is another prototype: a maze robot. The NPC in the game generates a random command such as "walkto topright" and the human player has to fulfill the task. This time a numerical reward is given in the form of the Manhattan distance, and a visible marker shows the target zone.
The core element of the game is the data structure of possible instructions, which maps language to actions:
self.instructions = {  # id: [name, targetpos, link]
    0: ["walkto topright", (9, 0), 4],
    1: ["walkto leftcoin", (4, 5), 2],
    2: ["walkto rightcoin", (9, 5), 1],
    3: ["walkto enemy", (3, 1), None],
    4: ["walkto topleft", (3, 1), 0],
    5: ["walkto bottomleft", (0, 6), 4],
}
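To make the reward scheme concrete, the Manhattan distance between the robot and the target position of the current instruction can be computed directly from this table. The following is a minimal sketch; the function name and the negative-distance convention (zero reward at the goal) are assumptions, not the original game code:

```python
INSTRUCTIONS = {  # id: [name, targetpos, link]
    0: ["walkto topright", (9, 0), 4],
    1: ["walkto leftcoin", (4, 5), 2],
    2: ["walkto rightcoin", (9, 5), 1],
    3: ["walkto enemy", (3, 1), None],
    4: ["walkto topleft", (3, 1), 0],
    5: ["walkto bottomleft", (0, 6), 4],
}

def manhattan_reward(robot_pos, instruction_id):
    """Negative Manhattan distance to the target: 0 when the goal is reached."""
    _, (tx, ty), _ = INSTRUCTIONS[instruction_id]
    x, y = robot_pos
    return -(abs(x - tx) + abs(y - ty))

print(manhattan_reward((0, 0), 0))  # robot at origin, goal "walkto topright" -> -9
```

The closer the human player steers the robot to the target zone, the higher (less negative) the reward becomes.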
In contrast to classical Reinforcement Learning, the robot in the game isn't controlled by a computer program; the human operator steers it. The non-player character (NPC), on the other hand, which acts as referee, is automated in software: it decides on its own what the next goal is and what reward is given.
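The NPC referee loop described above can be sketched as a small class: it picks an instruction, announces it in natural language, scores the player's position, and follows the "link" field of the table to a successor task where one exists. The class and method names here are assumptions for illustration, not the original game code:

```python
import random

class Referee:
    """NPC referee: picks goals on its own and scores the human-controlled robot."""

    def __init__(self, instructions):
        self.instructions = instructions
        self.current = random.choice(list(instructions))

    def announce(self):
        # The language command shown to the human player, e.g. "walkto topright".
        return self.instructions[self.current][0]

    def score(self, robot_pos):
        # Negative Manhattan distance to the current target position.
        tx, ty = self.instructions[self.current][1]
        x, y = robot_pos
        return -(abs(x - tx) + abs(y - ty))

    def next_goal(self):
        # Follow the "link" field to a successor task if one exists,
        # otherwise draw a fresh random instruction.
        link = self.instructions[self.current][2]
        self.current = link if link is not None else random.choice(list(self.instructions))
```

The "link" field thus chains instructions into small task sequences, while `None` hands control back to random goal selection.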
