My theory or hypothesis is that Tesla — thanks to its fleet of 500,000 cars with full autonomy hardware (likely to reach 1 million cars next year) — will be able to sense, filter, collect, and label neural network training data on a massive, unprecedented scale: the scale of billions of miles, or tens of thousands of years of continuous driving. I predict that this massive-scale data operation will lead to the development of an autonomous driving system superior to what’s been developed by Waymo, Cruise, Argo AI, or any other company.
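To give a feel for that scale, here is a back-of-the-envelope conversion from fleet miles to years of continuous driving. The 30 mph average speed is my own assumption, not a Tesla figure, and the function name is just for illustration.

```python
HOURS_PER_YEAR = 24 * 365  # continuous driving, no breaks


def miles_to_continuous_years(miles: float, avg_speed_mph: float = 30.0) -> float:
    """Convert a distance into years of non-stop driving at an assumed average speed."""
    hours = miles / avg_speed_mph
    return hours / HOURS_PER_YEAR


if __name__ == "__main__":
    # Hypothetical fleet mileage totals, chosen only to show the order of magnitude.
    for miles in (1e9, 5e9, 10e9):
        years = miles_to_continuous_years(miles)
        print(f"{miles:,.0f} miles ~ {years:,.0f} years of continuous driving")
```

Under that assumed speed, each billion miles works out to roughly 3,800 years of continuous driving, so "tens of thousands of years" corresponds to several billion miles. A different average speed shifts the numbers proportionally.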
So, when will this happen? Or rather, when will my prediction be tested? According to two retail investors who attended Tesla’s Autonomy Day, Tesla only started in earnest on software development and neural network training for city driving this year. At Autonomy Day, Andrej Karpathy mentioned that Tesla hasn’t yet collected camera data on snow. I recall Karpathy also discussing some aspect of highway driving, maybe when and how to pass other vehicles, that is currently handled by a heuristic but will probably eventually be learned through imitation. Recently at E3, Elon mentioned that Tesla is currently working on recognition of road surface and curbs. So, it sounds like a lot of work is ongoing in developing neural networks (e.g. curb recognition networks) and setting up the sense-filter-collect-label pipelines for the training data appropriate to these networks (e.g. images of curbs).
Tesla’s to-do list looks something like this:
1. Continue deploying Hardware 3 (i.e. Full Self-Driving Computer or FSD Computer) in new cars, and retrofit all eligible Hardware 2 cars with the new computer.
2. Develop, train, test, and deploy new versions of all the neural networks running on Hardware 2 that take advantage of the 10x-20x computing power in Hardware 3.
3. Develop neural networks for all remaining tasks necessary for fully autonomous driving: computer vision tasks, behaviour prediction tasks, and path planning/driving policy tasks. Or, where applicable, write a heuristic to handle the task.
4. Build training data pipelines for all these new neural networks, and start collecting and labelling (either automatically or manually) training data for these networks.
Once these four items are complete, then over the course of months and years, Tesla can sift through billions of miles of fleet data to train its neural networks on as much relevant data as it can manage to collect and label. At that point, Tesla’s system should surpass the systems of its competitors. If not, then something about my hypothesis is wrong.
The purpose of discussing this is to note that the limiting factor for Tesla’s progress over the next 6 months, year, or longer will probably be the software development effort that has to get done before Tesla can let its “data engine” rip. Software development is notoriously slow, hard to predict, and impervious to additional human labour. So, it could be a long, frustrating road before the “Tesla data advantage” thesis can be tested.
In addition to conventional software development and machine learning engineering, there is presumably also some amount of machine learning research. Research is risky and hard to predict, and can take a long time. So, the actual unfolding of events may not occur according to Elon’s tidy, quarter-by-quarter timeline.
Conversely, since there is potentially a long time between when work on a feature begins and when it is ready to deploy to customers, externally it might look like nothing is happening for a while until a feature suddenly drops. This might be especially the case since the different pieces of software are so interdependent. Path planning/driving policy depends on behaviour prediction and computer vision, and behaviour prediction itself relies on computer vision. Software infrastructure, such as Tesla’s photorealistic simulation and its internal software tools, is shared by all three. Multiple features might be delayed in their deployment because of a slow computer vision feature, a slow software tool, or slow work on the simulator.
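The knock-on effect of one slow component can be sketched as a small dependency graph. The component names follow the paragraph above, but the week counts are invented purely for illustration. The scheduling model is also a simplifying assumption of mine, not a description of how Tesla plans its work: a component starts only once all of its dependencies are ready.

```python
# Which components each component depends on, as described in the text.
DEPENDS_ON = {
    "software_tools": [],
    "simulator": [],
    "computer_vision": ["software_tools", "simulator"],
    "behaviour_prediction": ["computer_vision", "software_tools", "simulator"],
    "path_planning": [
        "behaviour_prediction",
        "computer_vision",
        "software_tools",
        "simulator",
    ],
}

# Hypothetical effort per component, in weeks (made-up numbers).
BASELINE_WEEKS = {
    "software_tools": 4,
    "simulator": 8,
    "computer_vision": 12,
    "behaviour_prediction": 6,
    "path_planning": 6,
}


def ready_week(component, own_weeks, depends_on=DEPENDS_ON):
    """Week at which a component is ready: its slowest dependency plus its own work."""
    deps = depends_on[component]
    start = max((ready_week(d, own_weeks, depends_on) for d in deps), default=0)
    return start + own_weeks[component]


if __name__ == "__main__":
    # Same plan, except computer vision takes 8 weeks longer than expected.
    slow_vision = dict(BASELINE_WEEKS, computer_vision=20)
    for name in ("behaviour_prediction", "path_planning"):
        before = ready_week(name, BASELINE_WEEKS)
        after = ready_week(name, slow_vision)
        print(f"{name}: week {before} -> week {after}")
```

With these made-up numbers, an 8-week slip in computer vision pushes behaviour prediction from week 26 to week 34 and path planning from week 32 to week 40, even though no extra work was added to either of them.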
Unlike hand-crafted heuristics, which improve at the rate of human tinkering, a neural network can go from nothing to proficient in weeks or months once training begins. DeepMind spent almost 3 years developing AlphaStar, and the agents’ final training run took less than 3 weeks. If there is a long lead time setting up neural network training, the network’s performance can stay at nothing for a long time and then suddenly shoot up to proficient, even if it takes a few tries. So, that’s why I can’t rule out Elon’s timeline based on what I see externally. If neural network progress is that non-linear, then who knows where Tesla will be in 6 months. Elon has been way off in his timelines for autonomy software releases, but that doesn’t mean he’ll always be wrong. Even a random guess might turn out to be right.