Elon Musk: new interview on autonomy


Elon says he is confident that full self-driving will be “feature complete” by the end of 2019, but will still require human supervision and intervention, like Autopilot. He guesses that the system will be safe enough to lift that requirement by the end of 2020.

At 14:25: “And we’re really starting to get quite good at not even requiring human labelling. Basically the person, say, drives the intersection and is thereby training Autopilot what to do.”

Sounds like Tesla is using supervised learning for path planning and/or driving policy. (Or possibly inverse reinforcement learning. Or another similar technique.)


That’s how I interpreted it as well (re: supervised learning for path planning).

Either the feature-complete-by-end-of-year claim is aggressive, in typical Elon fashion, or “feature-complete” is just a loosely defined term…

Good to hear that the new chip is on track to be produced at scale.


In April 2017, he said he thought you’d be able to sleep in your Tesla in about two years. So this is him pushing his timeline back a year :laughing:


Waymo is “Feature Complete” and has been for years. Feature complete could be as simple as “identifies intersections” sometimes. “Identifies road hazards” sometimes. “Plans routes through intersections” sometimes. “Sees stop signs and traffic signals” sometimes. “Manages pedestrian interactions” sometimes. etc…

Navigate on Autopilot is feature complete. It picks lanes, it passes slow vehicles, it sets the speed based on speed limits. It takes exits. Sometimes it even picks the right lanes at the right times and takes the right exits. It usually also doesn’t drive into stationary vehicles or highway dividers… usually. :smiley:


The Tesla hacker community has been pretty skeptical that anything like imitation learning is going on, because they haven’t been able to find clear evidence of the data gathering or parallel operation that seems to be required. Of course, the hacker community only sees a handful of vehicles and can only look at data passing between fairly coarse blocks of code, drivers, and kernel elements, so what they know with confidence is fairly limited.

Tesla’s (Elon’s) comments in this area have long implied that imitation learning and background evaluation of AP is going on in the vehicles, but the statements are pretty vague and don’t use the kind of language that would let us know what sort of techniques are being employed with any kind of specificity.

These recent comments seem to be some of the most clear cut with respect to asserting that the vehicles are doing more than just gathering snapshots of interesting phenomena. I’m inclined to agree that the proper interpretation is that some kind of imitation learning is going on, though it’s possible that vehicles are just recording the environment and driver responses.


That’s all that’s required for supervised imitation learning/behavioural cloning. You just need the state-action pairs, i.e. the environment and the driver input. If that data is uploaded, Tesla can do the training server-side.
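To make the framing concrete, here’s a toy sketch of behavioural cloning in that state-action-pair sense. Everything here is invented for illustration (the state features, the “expert” policy, the linear model); a real system would use a deep network on camera/sensor input, but the supervised-regression structure is the same.

```python
import numpy as np

rng = np.random.default_rng(0)

# Fake "logs": state = [lane offset, heading error], action = steering command.
# Assume the human expert steers to correct both errors with fixed gains:
# action = -1.0 * offset - 0.5 * heading, plus a little noise.
states = rng.normal(size=(1000, 2))
actions = states @ np.array([-1.0, -0.5]) + rng.normal(scale=0.01, size=1000)

# Behavioural cloning = ordinary supervised regression on the logged pairs.
# With a linear policy class this is just least squares.
weights, *_ = np.linalg.lstsq(states, actions, rcond=None)

# The cloned policy should roughly recover the expert's gains (~[-1.0, -0.5]).
print(np.round(weights, 2))
```

The point is just that nothing beyond (environment, driver input) pairs is needed at collection time; the training itself can happen server-side, exactly as described above.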

I think the whole debate about “shadow mode” (which I assume you’re referring to here) is a different topic altogether. Elon has framed “shadow mode” as a way to test the safety of the system, rather than train it.


Listening to the interview again, I’m not so sure what Elon was saying. It’s ambiguous. It’s hard to tell when he’s talking about perception, and when he’s talking about action (i.e. path planning and driving policy) — if at all.

This quote sure sounds like it could refer to imitation learning:

And we’re really starting to get quite good at not even requiring human labelling. Basically the person, say, drives the intersection and is thereby training Autopilot what to do.

But I wonder if it could just mean, for example, that when Tesla drivers stop at a traffic light, the cars upload a picture of the light and it is automatically labelled as red, since the cars were stopped. Those labelled images could then be used to train the perception neural network. This would be a form of weakly supervised learning applied to computer vision (e.g. object recognition).

An example of weakly supervised learning for image recognition is Facebook training a neural network based on images from Instagram, weakly labelled with hashtags. Instagram hashtags only loosely correspond to what’s in the image. That’s why this is weak labelling.
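Here’s a rough sketch of that weak-labelling idea, deriving a traffic-light label from driver behaviour instead of a human annotator. All the field names and data are invented; this is just to illustrate why the labels are “weak” (a car might stop for a pedestrian, or roll through a fresh green).

```python
def weak_light_label(speeds_mph):
    """Weakly label the light state from the car's speed trace near the
    intersection: "red" if the car came to a stop, "green" otherwise.
    Noisy by design, like Instagram hashtags."""
    return "red" if min(speeds_mph) < 1.0 else "green"

# Hypothetical fleet records: a camera snapshot plus the speed trace
# around the time it was taken.
fleet_records = [
    {"image": "cam_0001.jpg", "speeds": [32, 18, 4, 0, 0]},
    {"image": "cam_0002.jpg", "speeds": [35, 34, 36, 33, 35]},
]

labelled = [(r["image"], weak_light_label(r["speeds"])) for r in fleet_records]
print(labelled)  # first image weakly labelled "red", second "green"
```

With enough fleet data, the label noise averages out, which is the same bet Facebook made with hashtag supervision.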

So, I’m not sure if Elon was talking about imitation learning for path planning and driving policy, or if he was talking about weakly supervised learning for computer vision.


Sure - lots of possibilities here. For instance, hand-labeling traffic lights is really hard, so instead you take records of 100 people driving through an intersection and correlate their behavior with the visible traffic lights, thereby automatically determining which lights are relevant to which lane, what actions are permitted in which light state, and so on. If you have a lot of data, you can brute-force these problems and still get great accuracy.
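That brute-force correlation can be sketched with a counting example. The traversal records and the agreement metric below are made up for illustration; the idea is just that a light whose state consistently predicts stop/go behavior in a given lane is probably the light governing that lane.

```python
# Hypothetical traversal logs: lane used, observed state of each visible
# light ("A" and "B"), and whether the car stopped.
traversals = [
    {"lane": "left",  "lights": {"A": "red",   "B": "green"}, "stopped": True},
    {"lane": "left",  "lights": {"A": "green", "B": "red"},   "stopped": False},
    {"lane": "right", "lights": {"A": "red",   "B": "green"}, "stopped": False},
    {"lane": "right", "lights": {"A": "green", "B": "red"},   "stopped": True},
]

def relevance(lane, light):
    """Fraction of traversals in this lane where the light's state agreed
    with behavior: stopped on red, or went on green. High agreement
    suggests the light governs the lane."""
    rows = [t for t in traversals if t["lane"] == lane]
    agree = sum(t["stopped"] == (t["lights"][light] == "red") for t in rows)
    return agree / len(rows)

# In this toy data, light A perfectly predicts the left lane's behavior
# and light B the right lane's.
print(relevance("left", "A"), relevance("left", "B"))
print(relevance("right", "B"), relevance("right", "A"))
```

Scale the same counting up to millions of traversals and the lane-to-light mapping falls out statistically, no human labeler required.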


I used a machine learning service to make a really bad transcript of the podcast. Useful for finding time codes for quotes. (I’ve also manually fixed some errors.)


New quote from Elon today:

“I’m driving a development version of Autopilot right now, and it works extremely well recognizing traffic lights and stop signs,” Musk added. “It’s starting to make turns effectively in complex urban environments.”

Via The Verge.


HW2 Teslas now detecting stop lines painted on the road.


Slow-motion self-driving in parking lots. :stuck_out_tongue: