VeriDream: strengthening robot learning through dreams
Artificial intelligence helps robots perform tasks autonomously. They can learn and train while disconnected, working through simulations as if they were dreaming. However, the gap between simulation and real-life situations, together with the near-infinite number of choices available in the real world, limits the effectiveness of these methods. The VeriDream project, supported by the European Innovation Council (EIC), is determined to alleviate this problem.
A new European project aims to make robots dream, although no one knows for sure whether electric sheep are involved. Stéphane Doncieux, a professor at Sorbonne University and deputy director of the Institute of Intelligent Systems and Robotics (ISIR, CNRS/Sorbonne University), is leading the French contribution to the VeriDream project. This large-scale project has received funding from the EIC. "We help robots to learn through simulation phases," explains Stéphane Doncieux. "This consolidation takes place disconnected from reality and plays a role similar to the one attributed to sleep and dreams in humans."
VeriDream follows on from the Dream project, which ran from 2015 to 2018 and used open-ended learning methods to help robots adapt to different situations. "We tell robots what to do but we don't tell them how," explains Stéphane Doncieux, who coordinated the initial project. "It is up to the robots to explore the most appropriate sequences of actions to find the right solution." This system is based on the principle of rewarding robots that make the right choices, in order to gradually establish a policy.
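The reward principle described above can be sketched with a toy reinforcement-learning loop. This is a minimal illustration, not the project's actual setup: the one-dimensional grid world, the tabular Q-learning rule, and all parameters are assumptions chosen for clarity.

```python
import random

random.seed(0)

# States 0..4 on a line; the goal is state 4. Actions: -1 (left), +1 (right).
GOAL = 4
ACTIONS = (-1, 1)

def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.3):
    q = {(s, a): 0.0 for s in range(GOAL + 1) for a in ACTIONS}
    for _ in range(episodes):
        s = 0
        while s != GOAL:
            # Explore occasionally; otherwise exploit current value estimates.
            if random.random() < epsilon:
                a = random.choice(ACTIONS)
            else:
                a = max(ACTIONS, key=lambda act: q[(s, act)])
            s2 = min(max(s + a, 0), GOAL)
            reward = 1.0 if s2 == GOAL else 0.0
            # The reward signal gradually shapes the action values.
            best_next = max(q[(s2, act)] for act in ACTIONS)
            q[(s, a)] += alpha * (reward + gamma * best_next - q[(s, a)])
            s = s2
    return q

q = train()
# Greedy policy extracted from the learned values: move right toward the goal.
policy = {s: max(ACTIONS, key=lambda act: q[(s, act)]) for s in range(GOAL)}
print(policy)
```

Nobody tells the agent *how* to reach the goal; the reward alone, accumulated over many simulated episodes, gradually shapes the policy.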
However, how effective a robot's behaviour can be depends heavily on the quality of its internal representations, namely the information it gathers from its environment. Each machine also has its own physical limitations on the actions it can actually perform. Moreover, when robots are faced with too many possibilities, they find it difficult to make choices and to adjust those choices.
The Dream project therefore took its inspiration from human and animal development while adding a reinforcement learning phase based on simulations. Robots can train on the programme without being switched on, which saves considerable time and money while also avoiding wear on the machines. Robots with arms were given tasks consisting of simple object manipulations, such as throwing or pushing.
Robots that learn (the DREAM project)
VeriDream has taken up the same principle as the Dream project, but has opted to apply it to an industrial context. "Learning always takes place in a certain environment and simulations do not necessarily correspond to the real conditions the robot will be operating in," explains Stéphane Doncieux. To solve this issue, researchers will attempt to automatically detect failures in manually defined policies before the robot is even confronted with them.
"We test a policy generated beforehand, then disconnect from reality to analyse what happened and explore new alternatives," continues Stéphane Doncieux. This work is based on evolutionary methods, which test a neural network's behaviour without knowing what the ideal solution might be. The algorithms generate variations of a policy and then select the most interesting among them. This is very different from the supervised learning approach currently in vogue, in which we know what a network should do and correct it accordingly.
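The variation-and-selection loop described here can be sketched as a tiny evolution strategy. Everything below is an illustrative assumption, not VeriDream's implementation: the search treats the fitness score as a black box (it never sees the "ideal solution"), generates mutated variants of the current policy parameters, and keeps the best one.

```python
import random

def fitness(params):
    # Black-box score: higher is better. The hidden target stands in for
    # "good behaviour"; the search never inspects it directly.
    target = [0.5, -1.2, 3.0]
    return -sum((p - t) ** 2 for p, t in zip(params, target))

def evolve(generations=200, offspring=10, sigma=0.3, seed=1):
    rng = random.Random(seed)
    parent = [0.0, 0.0, 0.0]  # initial policy parameters
    for _ in range(generations):
        # Generate variations of the current policy...
        variants = [[p + rng.gauss(0, sigma) for p in parent]
                    for _ in range(offspring)]
        # ...then select the most interesting (best-scoring) one,
        # keeping the parent if no variant improves on it.
        parent = max(variants + [parent], key=fitness)
    return parent

best = evolve()
print([round(p, 2) for p in best])
```

Unlike supervised learning, no gradient of a known correct answer is used: only the relative ranking of candidate behaviours drives the search, which is what makes such methods suitable for exploring alternatives to a policy after a failure.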