I believe the demo at 33:20 in this video shows Mobileye’s reinforcement learning system in action:
It’s an impressive demo. It had flown under my radar, but Mobileye is already testing reinforcement learning-based path planning in the real world.
It seems to me (as a casual, outside observer with limited knowledge of the technology) that path planning is currently the primary bottleneck to progress on autonomous driving. Hand-coded elements of path planning systems require humans to introspect and theorize about the tacit knowledge that enables them to drive a car, and then to attempt to operationalize that tacit knowledge in a programming language. This is an inherently daunting problem, and it requires a slow iterate-and-test cycle: software engineers tweak the code, send out safety drivers, wait, collect reports, tweak the code again, and repeat.
Trying to solve path planning using hand coding might be:
- intractable
Or, even if it isn’t, it might be:
- excruciatingly slow
Tacit knowledge is not something that, historically, humans have had great success in coding into robots. For example, hand coding has mostly failed to get robot hands to manipulate objects like human hands do. Recent progress has been made using machine learning, but it remains an unsolved problem for many applications, such as packing Amazon boxes in warehouses or general assembly of Model 3s at the Tesla factory.
The tacit knowledge is instantiated in our brains, but that doesn’t mean we can articulate it explicitly in a natural language (like English) or a programming language. By analogy, we have all kinds of tacit knowledge about cell division, immune response, protein folding, and so on in our bodies, yet for thousands of years we could not explicitly articulate any of it.
Having a well-working human body doesn’t automatically impart scientific knowledge about biological systems, and having a well-working human brain doesn’t inherently impart scientific knowledge about cognitive systems. Decades or centuries of scientific work might remain before we can articulate an explicit, step-by-step explanatory theory of how humans drive that is detailed enough that it could be implemented in a robot. Software engineers at Google might not be equipped to solve this problem through introspection, intuition, folk knowledge, and on-the-fly theorizing.
But machine learning provides hope. We don’t have a complete scientific understanding of human vision, but deep supervised learning has allowed us to solve some vision problems without fully understanding either human vision or computer vision. It therefore makes sense to try machine learning on other problems where human success relies on tacit knowledge and where scientific knowledge is lacking.
Since we don’t fully understand reinforcement learning, and since we also don’t fully understand the intricacies of human path planning, we can’t say in advance whether reinforcement learning will be capable of human-level path planning. We can only try it and hope that eventually it’s as successful as supervised learning has been for vision.
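To make the framing concrete, here is a minimal, purely illustrative sketch of how a tiny slice of path planning (a single lane-change decision) might be posed as a reinforcement learning problem, using tabular Q-learning. Every detail — the toy road, the state, the actions, the reward values — is invented for this example and bears no relation to Mobileye’s actual system or to anything production-grade.

```python
import random
from collections import defaultdict

# Toy "highway" for illustration only: 3 lanes, 10 cells long, with a
# stalled car blocking lane 1 at cell 5. The agent advances one cell per
# step and chooses whether to keep its lane or change lanes.
LANES, LENGTH = 3, 10
OBSTACLE = (5, 1)          # (cell, lane) of the stalled car
ACTIONS = (-1, 0, +1)      # lane change: left, keep, right

def step(state, action):
    """One environment step: returns (next_state, reward, done)."""
    cell, lane = state
    lane = min(max(lane + action, 0), LANES - 1)
    cell += 1
    reward = -1 if action != 0 else 0    # small penalty for swerving
    if (cell, lane) == OBSTACLE:
        return (cell, lane), -100, True  # collision ends the episode
    if cell == LENGTH:
        return (cell, lane), +10, True   # reached the end of the segment
    return (cell, lane), reward, False

# Tabular Q-learning, the simplest possible RL algorithm -- chosen here for
# clarity, not because it resembles what a production planner would use.
Q = defaultdict(float)
alpha, gamma, epsilon = 0.1, 0.99, 0.1

for episode in range(5000):
    state, done = (0, 1), False
    while not done:
        if random.random() < epsilon:
            action = random.choice(ACTIONS)          # explore
        else:
            action = max(ACTIONS, key=lambda a: Q[state, a])  # exploit
        next_state, reward, done = step(state, action)
        best_next = max(Q[next_state, a] for a in ACTIONS)
        Q[state, action] += alpha * (reward + gamma * best_next - Q[state, action])
        state = next_state

# Replay the greedy policy: the learned behavior should be to change lanes
# before cell 5 and otherwise drive straight.
state, done, path = (0, 1), False, [(0, 1)]
while not done:
    action = max(ACTIONS, key=lambda a: Q[state, a])
    state, _, done = step(state, action)
    path.append(state)
print(path)
```

The point of the toy is only that the desired behavior (merging around the stalled car) is never hand-coded; it emerges from trial and error against a reward signal, which is exactly the property that makes this family of approaches attractive for problems we can’t articulate explicitly.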
Mobileye’s work is encouraging because it is an early, limited proof of concept that reinforcement learning can do path planning well, both in simulation and in the real world. With approaches to path planning that rely on large hand-coded elements, I worry that self-driving cars might never come to fruition, or might take decades to become competent drivers. With machine-learned path planning systems, there is hope of rapid progress and sudden breakthroughs that enable the technology to be commercialized at scale in the near future.