The team’s algorithm, called Dreamer, uses past experience to build a model of the surrounding world. Dreamer also lets the robot carry out trial-and-error calculations in a computer program rather than in the real world, predicting the likely future outcomes of its potential actions. This allows it to learn far faster than it could through real-world practice alone. Once the robot had learned to walk, it kept learning, adapting to unexpected situations such as resisting attempts to knock it over with a stick.
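The mechanism described above can be illustrated with a toy sketch: an agent collects real transitions, fits a crude world model to them, and then does its trial and error inside that model rather than in the real world. Everything here is invented for illustration (the one-dimensional environment, the goal, and all function names); it is not Dreamer’s actual code, which uses learned neural world models and a learned policy.

```python
import random

random.seed(0)

# Toy 1-D environment: the state is a position, the goal is position 10.
GOAL = 10.0

def real_step(state, action):
    # True dynamics, unknown to the agent: the action shifts the state,
    # plus a fixed offset the agent must discover from experience.
    return state + action + 0.5

# --- 1. Build a world model from past experience (a replay buffer) ---
replay = []
state = 0.0
for _ in range(20):
    action = random.choice([-1.0, 0.0, 1.0])
    next_state = real_step(state, action)
    replay.append((state, action, next_state))
    state = next_state

# Learned model: estimate the constant offset from observed transitions.
offset = sum(s2 - s - a for s, a, s2 in replay) / len(replay)

def imagined_step(state, action):
    # "Dream": predict the next state without touching the real world.
    return state + action + offset

# --- 2. Trial and error inside the model, not the real world ---
def plan(state, horizon=5):
    best_action, best_error = None, float("inf")
    for action in (-1.0, 0.0, 1.0):
        s = state
        for _ in range(horizon):
            s = imagined_step(s, action)
        error = abs(GOAL - s)
        if error < best_error:
            best_action, best_error = action, error
    return best_action

# --- 3. Act in the real world using the imagined plans ---
state = 0.0
for _ in range(8):
    state = real_step(state, plan(state))

print(round(state, 2))
```

The point of the sketch is the division of labor: the expensive trial and error happens in `imagined_step`, and the real world is touched only once per decision, which is why model-based agents need far fewer real interactions.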
“Teaching robots through trial and error is a difficult problem, made even more difficult by the long training time required for such education,” said Lerrel Pinto, an assistant professor of computer science at New York University who specializes in robotics and machine learning. Dreamer shows that deep learning and world models can teach robots new skills in a very short time, he says.
Jonathan Hurst, a professor of robotics at Oregon State University, says the findings, which have not yet been peer-reviewed, show that “reinforcement learning will be a cornerstone in the future of robotic control.”
Removing the simulator from robot training has many benefits. The algorithm could be useful for teaching robots how to learn skills in the real world and adapt to situations such as hardware failures, Hafner says. For example, a robot could learn to walk with a faulty motor in one leg.
The approach could also have huge potential for more complicated applications such as autonomous driving, which would otherwise require complex and expensive simulators, says Stefano Albrecht, an assistant professor of artificial intelligence at the University of Edinburgh. A new generation of reinforcement-learning algorithms could “pick up super-fast in the real world how the environment works,” Albrecht says.