Right now, our system can successfully perform a relatively complex human-level task about 85% of the time. That figure includes cases in which the robot recognizes that it has failed at a specific behavior and automatically tries again. Each task is made up of about 45 independent behaviors, which means that each individual behavior ends in success or a recoverable failure about 99.6% of the time, since the 45th root of 0.85 is roughly 0.996.
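The arithmetic behind that per-behavior figure can be checked in a few lines. This is only a sketch under the independence assumption stated above; the function names are ours and purely illustrative.

```python
def per_behavior_rate(task_rate: float, n_behaviors: int) -> float:
    """Per-behavior success rate implied by an overall task success rate,
    assuming every behavior succeeds independently with the same probability."""
    return task_rate ** (1.0 / n_behaviors)


def overall_task_rate(behavior_rate: float, n_behaviors: int) -> float:
    """Overall task success rate when all n_behaviors must succeed in sequence."""
    return behavior_rate ** n_behaviors


rate = per_behavior_rate(0.85, 45)
print(f"per-behavior rate: {rate:.4f}")  # prints 0.9964
print(f"round trip:        {overall_task_rate(rate, 45):.2f}")  # prints 0.85
```

The same functions show how quickly reliability compounds: with 45 chained behaviors, even a small drop in the per-behavior rate produces a much larger drop in the task-level rate.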
The three tasks are: get a bottle from the fridge, get a cup from the dishwasher, and put a cup or bottle on a table or in a cabinet.
We teach the robot using an immersive telepresence system that includes a model of the robot mirroring what the real robot is doing. The teacher sees what the robot is seeing live, in 3D, from the robot’s sensors. The teacher can select different behaviors to instruct and then annotate the 3D scene, for example by associating parts of the scene with a behavior, specifying how to grasp a handle, or drawing the line that defines the axis of rotation of a cabinet door. When teaching a task, a person can try different approaches, using their creativity to perform the task with the robot’s hands and tools. This makes it easy to put different tools to use and lets humans quickly transfer their knowledge to the robot for specific situations.
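To make the hinge annotation concrete, here is a minimal geometric sketch of how a line drawn along a cabinet door's hinge could be turned into a rotation axis and used to predict where the handle moves as the door opens. It is not the system's actual representation: the `HingeAnnotation` class, its fields, and `rotate_about_hinge` are hypothetical names chosen for illustration, and the rotation uses the standard Rodrigues formula.

```python
import math
from dataclasses import dataclass
from typing import Tuple

Vec3 = Tuple[float, float, float]


@dataclass(frozen=True)
class HingeAnnotation:
    """Hypothetical record of a line the teacher draws along a door hinge."""
    start: Vec3
    end: Vec3

    def axis(self) -> Vec3:
        """Unit vector pointing from start to end."""
        d = tuple(e - s for s, e in zip(self.start, self.end))
        norm = math.sqrt(sum(c * c for c in d))
        if norm == 0.0:
            raise ValueError("hinge line needs two distinct points")
        return tuple(c / norm for c in d)


def cross(a: Vec3, b: Vec3) -> Vec3:
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def rotate_about_hinge(point: Vec3, hinge: HingeAnnotation, angle_rad: float) -> Vec3:
    """Rotate a point (e.g. a handle position) about the hinge line
    using the Rodrigues rotation formula."""
    k = hinge.axis()
    v = tuple(p - s for p, s in zip(point, hinge.start))  # point relative to hinge
    k_dot_v = sum(kc * vc for kc, vc in zip(k, v))
    k_cross_v = cross(k, v)
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    rotated = tuple(vc * c + kxv * s + kc * k_dot_v * (1.0 - c)
                    for vc, kxv, kc in zip(v, k_cross_v, k))
    return tuple(r + o for r, o in zip(rotated, hinge.start))


# Vertical hinge through the origin; handle one unit out along x.
hinge = HingeAnnotation(start=(0.0, 0.0, 0.0), end=(0.0, 0.0, 1.0))
handle_open = rotate_about_hinge((1.0, 0.0, 0.0), hinge, math.pi / 2)
```

With this example input, opening the door by 90 degrees moves the handle from the x axis to (approximately) the y axis, while its height and its distance from the hinge stay the same.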