Some fascinating stuff on behaviour generation (a.k.a. planning a.k.a. decision-making a.k.a. path planning and driving policy — so many synonyms!). If I understand correctly, Aurora is using imitation learning (a.k.a. learning from demonstrations) within a framework of formal constraints similar to Mobileye’s RSS or Nvidia’s SFF. This sounds really cool!
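To make the idea concrete, here is a minimal sketch of what "imitation learning inside formal constraints" could look like: a learned policy proposes an action, and a rule-based safety layer (in the spirit of RSS's minimum safe following distance) overrides it when the proposal is provably unsafe. Every function name and number below is illustrative, not Aurora's (or Mobileye's) actual system.

```python
# Hypothetical sketch: an imitation-learned policy wrapped in a formal
# safety constraint, in the spirit of Mobileye's RSS / Nvidia's SFF.

def learned_policy(gap_m: float, ego_speed: float) -> float:
    """Stand-in for a neural net: returns a proposed acceleration (m/s^2)."""
    # Pretend the imitation-learned model tries to keep a 2-second gap.
    desired_gap = 2.0 * ego_speed
    return max(min(0.5 * (gap_m - desired_gap), 2.0), -4.0)

def rss_min_safe_gap(ego_speed: float, lead_speed: float,
                     reaction_time: float = 0.5,
                     max_brake: float = 6.0) -> float:
    """Simplified RSS-style minimum safe following distance (metres)."""
    ego_stop = ego_speed * reaction_time + ego_speed ** 2 / (2 * max_brake)
    lead_stop = lead_speed ** 2 / (2 * max_brake)
    return max(ego_stop - lead_stop, 0.0)

def constrained_policy(gap_m: float, ego_speed: float,
                       lead_speed: float) -> float:
    """Learned proposal, overridden by the formal constraint when unsafe."""
    accel = learned_policy(gap_m, ego_speed)
    if gap_m < rss_min_safe_gap(ego_speed, lead_speed):
        accel = -6.0  # the constraint layer forces hard braking
    return accel

# Safe gap: the learned policy's output passes through unchanged.
print(constrained_policy(gap_m=60.0, ego_speed=20.0, lead_speed=20.0))  # 2.0
# Unsafe gap: the formal layer overrides the network.
print(constrained_policy(gap_m=5.0, ego_speed=20.0, lead_speed=20.0))   # -6.0
```

The appeal of this split is that the neural network only has to drive naturally, while the hand-written constraint layer carries the formal safety guarantee.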
I dig this slide:
This is what Tesla plans to do with 600,000+ human drivers… Although there is a chicken-and-egg problem in that you have to train the neural networks to a certain level of safety before you can deploy them fleet-wide to be trained more.
Unless… rather than training from interventions, you train from passive observation: compare the behaviours your imitation-learned model generates against what human drivers actually do when they’re in full manual mode (i.e. Autopilot off).
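That passive-observation signal is essentially behavioural cloning run in "shadow mode": with Autopilot off, the model still proposes actions, and the human's actual actions serve as the labels. A minimal sketch, with entirely made-up features, weights, and log data:

```python
# Illustrative "shadow mode" training signal: the model's proposed
# steering is scored against the human's logged steering while driving
# manually. All names, weights, and data here are hypothetical.

def model_steering(features):
    """Stand-in for the network's proposed steering angle (radians)."""
    w = [0.8, -0.1]  # pretend these are learned weights
    return sum(wi * xi for wi, xi in zip(w, features))

def passive_loss(log):
    """Mean squared error between proposed and human steering over a log."""
    errs = [(model_steering(f) - human) ** 2 for f, human in log]
    return sum(errs) / len(errs)

# Each entry: (feature vector from perception, human steering in manual mode).
manual_log = [
    ([0.1, 0.0], 0.08),
    ([0.3, 0.2], 0.21),
    ([-0.2, 0.1], -0.18),
]
print(passive_loss(manual_log))
```

No intervention ever has to happen: every mile of manual driving generates gradient signal, which is what makes a 600,000-driver fleet such an interesting data source.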
Another takeaway: it seems like a lot of major companies working on self-driving cars — including Tesla, Waymo, Mobileye, Uber ATG, and Aurora — are now working on machine learning approaches to behaviour generation, whether that means imitation learning, reinforcement learning, or both. (This is also true for smaller startups like Wayve, Ghost, and Comma. Maybe Pronto too.) I wouldn’t be surprised if other big companies like Cruise, Zoox, and Baidu are also working on IL or RL.