I’m just going to copy/paste what I wrote about the Waymo talk on Tesla Motors Club:
What a yummy talk! I loved it!
Drago seems to think the same way as Karpathy. There is an ML portion of the software stack, and a non-ML portion. The goal is to expand the ML portion of the stack, so that over time it takes over more and more responsibilities from the non-ML portion. It is comforting that they seem to agree on that.
I loved the discussion of supervised imitation learning (ChauffeurNet) and inverse reinforcement learning (trajectory optimization agent). When Lex Fridman asked Pieter Abbeel, an imitation learning and reinforcement learning expert, about imitation learning for self-driving cars, he recommended inverse reinforcement learning over supervised imitation learning. It was really cool to see Drago give a comparison of both approaches.
I found it really interesting how he talked about populating simulations with a zoo (or menagerie) of agents: recorded “ghosts” that don’t react at all, agents that use a simple brake-and-swerve algorithm (makes me think of the cars in GTA V), and multiple varieties of complex machine-learned agents. A mix of different kinds of agents will increase simulation diversity, and that could make simulation training generalize better to the real world. This is similar to the idea of simulation randomization, which can help compensate for the “reality gap” between simulation and reality. Except that the differences between, say, supervised imitation-learned agents and inverse reinforcement-learned agents aren’t random.
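To make the “simple brake and swerve algorithm” idea concrete, here’s a minimal sketch of what such a reactive agent might look like. This is my own hypothetical illustration, not anything Waymo described: the obstacle representation, thresholds, and control outputs are all assumptions.

```python
from dataclasses import dataclass

@dataclass
class Obstacle:
    distance_ahead: float   # metres from the agent, along its lane
    lateral_offset: float   # metres left (+) or right (-) of the agent's path

def brake_and_swerve(obstacles, brake_range=30.0, swerve_range=10.0, lane_width=3.5):
    """Hypothetical reactive policy: brake for anything in-lane within
    brake_range; also swerve away from it if it is within swerve_range.
    Returns (throttle, steering), each in [-1, 1]."""
    # Keep only obstacles ahead of us and inside our lane.
    in_lane = [o for o in obstacles
               if o.distance_ahead > 0.0 and abs(o.lateral_offset) < lane_width / 2]
    if not in_lane:
        return 1.0, 0.0                        # clear road: full throttle, straight
    nearest = min(in_lane, key=lambda o: o.distance_ahead)
    if nearest.distance_ahead < swerve_range:
        steer = -1.0 if nearest.lateral_offset >= 0 else 1.0  # steer to the emptier side
        return -1.0, steer                     # hard brake and swerve
    if nearest.distance_ahead < brake_range:
        return -0.5, 0.0                       # moderate braking, hold the lane
    return 1.0, 0.0
```

For example, `brake_and_swerve([Obstacle(8.0, 0.5)])` returns `(-1.0, -1.0)` — hard braking while swerving left-to-right away from the obstacle. The point of mixing agents like this with ghosts and learned agents is that each kind fails (and reacts) differently, which is exactly the diversity you want in simulation.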
The discussion of training examples and the long tail of driving scenarios seems the most relevant to Tesla. Suppose there’s a scenario that occurs every 10 million miles. If Waymo’s fleet drives 10 million miles per year, it will encounter 1 training example per year. If Tesla’s fleet drives 10 billion miles per year, it will encounter 1,000 training examples per year. A scenario that occurs every 100 million miles: 1 per decade for Waymo, 100 per year for Tesla. Tesla can train its neural networks on a much longer tail of scenarios than Waymo. Where Waymo has to use hand-coded algorithms due to a lack of training examples, Tesla can use a machine learning approach.
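The arithmetic above is just fleet miles per year divided by the scenario’s rarity, but it’s worth seeing the two fleet sizes side by side. The 10-million and 10-billion figures are the illustrative numbers from the paragraph, not measured fleet data:

```python
def examples_per_year(fleet_miles_per_year, miles_per_occurrence):
    """Expected training examples collected per year for a scenario
    that occurs once every miles_per_occurrence miles of driving."""
    return fleet_miles_per_year / miles_per_occurrence

# Scenario that occurs every 10 million miles:
examples_per_year(10e6, 10e6)    # Waymo-scale fleet (10M mi/yr): 1 per year
examples_per_year(10e9, 10e6)    # Tesla-scale fleet (10B mi/yr): 1,000 per year

# Scenario that occurs every 100 million miles:
examples_per_year(10e6, 100e6)   # 0.1 per year, i.e. about 1 per decade
examples_per_year(10e9, 100e6)   # 100 per year
```

The ratio between the fleets is constant (1,000×), so the rarer the scenario, the more the larger fleet’s advantage matters: it’s the difference between having no usable training set at all and having a modest one.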
Drago seemed to say that the main (only?) constraint to applying machine learning in the areas where Waymo currently uses hand-coded algorithms is the difficulty of collecting training data. If this is simply a question of cars on the road, then this supports the thesis that Tesla has an advantage in applying machine learning to autonomous driving.
Alphabet probably has an advantage with regard to neural network architectures, since Google and DeepMind seem pretty good at designing them. But if you don’t have the data to train your neural networks on, then it doesn’t matter.