Facebook AI: Billion-scale semi-supervised learning for state-of-the-art image and video classification

I wonder if Tesla could use a technique like this. Andrej Karpathy retweeted a Facebook AI tweet about this. Since at least February 2019, job postings for Tesla’s Autopilot division have said Tesla is looking for candidates who can:

Devise methods to use enormous quantities of lightly labelled data in addition to a diverse set of richly labelled data.

In terms of weak labelling or light labelling, I hypothesize that driver action may be the source. For instance, if a driver goes straight through a traffic light, it is more likely to be green than red. If a driver goes straight without slowing down or changing the steering angle, it is more likely than not that there is no obstacle obstructing the road.
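
To make the driver-action idea concrete, here is a rough Python sketch of how weak labels might be derived from driving telemetry. Everything in it is an assumption for illustration: the `DriveSnippet` record, its field names, the `weak_labels` function, and the thresholds are all hypothetical, and nothing here reflects how Tesla actually does it.

```python
from dataclasses import dataclass


@dataclass
class DriveSnippet:
    """Hypothetical telemetry recorded alongside a short camera clip."""
    speed_start_mps: float          # speed at the start of the clip
    speed_end_mps: float            # speed at the end of the clip
    max_steering_change_deg: float  # largest steering change during the clip
    passed_traffic_light: bool      # whether the car drove through a light


def weak_labels(snippet, brake_threshold_mps=1.0, steer_threshold_deg=2.0):
    """Return noisy labels inferred from driver behaviour alone.

    The thresholds are arbitrary placeholders, not tuned values.
    """
    slowed = (snippet.speed_start_mps - snippet.speed_end_mps) > brake_threshold_mps
    swerved = snippet.max_steering_change_deg > steer_threshold_deg
    labels = {}
    if not slowed and not swerved:
        # Driver carried straight on: the road was probably clear.
        labels["road_clear"] = True
        if snippet.passed_traffic_light:
            # Most drivers do not run red lights, so green is the likelier state.
            labels["light_probably_green"] = True
    return labels


steady = DriveSnippet(20.0, 20.2, 0.5, True)
print(weak_labels(steady))
```

These labels would be wrong some of the time (drivers do run red lights), which is exactly why they count as weak labels rather than rich ones.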


“Quantity has a quality all its own” - Joseph Stalin

Yes - scaling up from tens of millions to billions of training examples is going to have a qualitative impact on accuracy, and that will lead to new applications becoming feasible. NLP shows this already, and video applications will quickly follow. Accuracy has a strange effect in that small numerical improvements can lead to big application breakthroughs. For example, we’ve had speech recognition software for decades, but the accuracy, while numerically high, has lagged behind what we needed to make applications for real humans. 95% wasn’t enough. 98% wasn’t enough. 99.5% is almost there, and 99.8% gets us some real utility. As you cross these thresholds, new stuff becomes practical, and unsupervised learning will bring some of those crossings.

Semi-supervised, weakly-supervised, and self-supervised are all subsets of ‘unsupervised’ in the sense that they don’t primarily rely on humans to label the training data - it’s mainly the back office procedures that vary. And it could well be that simple autoencoders become the most important element of this over time as they allow you to just train the model by trying to predict the future from the past, which is both fundamental to utility and a fairly direct method of training unsupervised models. Autoencoders have had a lot of great successes but it’s important to explore other options as well.
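
As a minimal sketch of the "predict the future from the past" idea, the snippet below turns an unlabelled sequence into (past, future) training pairs with no human labelling. The function name `make_prediction_pairs` and its parameters are made up for illustration. Strictly speaking this is next-step prediction; a classical autoencoder would instead use the input itself as the reconstruction target.

```python
def make_prediction_pairs(sequence, context_len, horizon=1):
    """Build (past, future) pairs from an unlabelled sequence.

    The 'label' for each window of `context_len` items is simply the item
    that occurs `horizon` steps after the window ends, so the data
    supervises itself.
    """
    pairs = []
    last_start = len(sequence) - context_len - horizon
    for i in range(last_start + 1):
        past = sequence[i:i + context_len]
        future = sequence[i + context_len + horizon - 1]
        pairs.append((past, future))
    return pairs


frames = [1, 2, 3, 4, 5]  # stand-in for video frames or sensor readings
print(make_prediction_pairs(frames, context_len=3))
# prints [([1, 2, 3], 4), ([2, 3, 4], 5)]
```

A model trained on such pairs gets its training signal for free from whatever happens next, which is why this scales to billions of examples more easily than hand labelling.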

Doing this stuff will require procedural and algorithmic improvements, which are coming at a fast pace, and hardware improvements, which are also proceeding nicely. A lot of dedicated hardware is just getting into the datacenter and the application site as we speak and the techniques for using this stuff well are maturing at a blistering pace. I believe that Dojo is fundamentally targeted at doing unsupervised learning on a massive scale (Elon said as much).

“The future will be unsupervised.” - James Douma
