Konrad Kording: Why we need more/better scientific theory in neuroscience and in deep learning alike

As usual, I recommend y’all try watching at 1.5x speed.

One of the problems in neuroscience (so I hear) is that there is an ocean of data but a pond of theory. Researchers spend their careers studying one part of the brain, producing tons of understanding about that one small part, but there is little understanding of how the whole brain (or the cerebral cortex) works as a cohesive system to perceive outside phenomena from sensory input, understand the world, and produce intelligent behaviour.

Similarly, deep learning is in a phase where engineering precedes science. There is lots of practical success but little scientific theory about how it works.

I think neuroscience and deep learning can have an exciting interplay where theory of intelligence is concerned. Neuroscience is attempting to reverse engineer an existing intelligent system. AI is attempting to engineer an intelligent system from scratch. Reverse engineering and engineering come at the problem from opposite directions.

If you could understand how the brain achieves perception, comprehension, and intelligent action, then you could implement the same principles in an artificial system, and there would be no need for deep learning at all. But to understand the brain, you need to develop theories of how the brain works. A neuroscientist can look at every AI system as a model of how the brain might work (regardless of whether that is the engineer’s intent). This is fertile ground for theoretical progress.

Neuroscientists can look at deep learning systems’ performance — if it’s brain-like, then it’s worth looking at the system’s design. Does the design of the system correspond to anything we know about the brain’s design? If not, might the brain implement something similar that we just haven’t discovered yet?

AI researchers are also looking at the brain for inspiration. In a sense, every time an AI researcher tries to implement a principle from neuroscience in a deep learning system, they are conducting a neuroscience experiment: they are building a model that can test the principle’s predictions.

The post above reflects a different way of thinking about AI progress than the way I thought about it until very recently. My strong hunch was that progress in AI would trickle down from neuroscience. This model of progress is a pipeline from basic natural science to engineering application. I’ll name this the pipeline model. But then Yann LeCun showed in his talk that, in many historical examples, a technology was developed before its underlying scientific principles were explicitly understood.

The big question for AI progress is: where does the knowledge of intelligence come from? With physics, since the whole world is physical, there is ample room for observation, experimentation, exploration, and discovery in all parts of life. You can discover things by accident or through cultural evolution while crafting weapons, or you can deliberately roll bowling balls down ramps to see what happens.

With biology, yes, life is all around us, but while everything is physical (or everything empirical is physical), some things are living and some things are non-living. Only interaction specifically with biological things can yield biological knowledge. You won’t get it from bowling balls.

With the cognitive sciences, an even narrower subset of things in the world exhibit intelligence. Especially if by “intelligence” we specifically mean the kind of intelligence found in birds and mammals that enables an animal to invent or discover a new behaviour that solves a problem. (Don’t know a good, short name for this. Maybe “originative intelligence” would be a good name, if one doesn’t already exist.*) The subset is even smaller if we mean the kind of intelligence found in humans that we call “general intelligence”.

A possible model is that AI progress will come purely from engineering machine learning systems, and not at all from reverse engineering the brain. The “signal” that transmits the knowledge of intelligence to human engineers will essentially be trial and error. AI systems will get progressively better through a design process or evolutionary process (like how biological intelligent systems evolved in the first place). The trial and error model is the opposite of the pipeline model. In the pipeline model, all progress comes from neuroscience. In the trial and error model, no progress does.

The model of AI progress I described in my previous post I’ll name the loop model. Rather than a pipeline that goes in one direction from neuroscience to AI, in the loop model, engineering work in AI is seen as experimental neuroscience that feeds into the discipline of neuroscience. A lot of engineering work in deep learning is inspired by neuroscience, so effectively neuroscience feeds ideas to AI, then AI tests them, and the results of these tests feed back into neuroscience. Conversely, original, endogenous ideas in AI inspire reverse engineering work in neuroscience. It’s a two-way loop in which ideas and experimental results flow both ways.

*What we are currently trying to develop in AI is mostly non-originative intelligence, like you find in reptiles. The training process is originative, analogous to the evolutionary process for reptiles. But a robot or a virtual agent in the wild, like a reptile in the wild, doesn’t discover or invent new behaviours to solve previously unsolved problems. It simply carries out the behaviours it has already learned in training.

Continual learning/lifelong learning in AI is an attempt to endow AI systems with originative intelligence. AI researchers want to take the training process and allow it to happen in real time in an individual AI (or perhaps in a set of AIs sharing information). Continual learning is analogous to the evolutionary leap that occurred in the brains of birds, mammals, and especially humans. The information flow of the evolutionary process got transported into the brain, and started running in real time rather than in multi-generational time.
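The frozen-versus-continual distinction can be made concrete with a toy sketch. Below, a simple bandit-style agent is trained on one environment and then deployed into a changed one, either with its value estimates frozen (non-originative, the “reptile” case) or with learning left running (continual/originative). The setup and all names here are my own illustration, not anything from the talk.

```python
import random

random.seed(0)

def reward(arm, best):
    # Hypothetical toy environment: exactly one arm pays off.
    return 1.0 if arm == best else 0.0

def greedy(values):
    # Pick the arm with the highest estimated value.
    return max(range(len(values)), key=lambda a: values[a])

def train(best, steps=500, n_arms=3, eps=0.2, lr=0.1):
    # "Evolutionary"/training phase: estimate arm values with
    # epsilon-greedy exploration, then hand back the estimates.
    values = [0.0] * n_arms
    for _ in range(steps):
        arm = random.randrange(n_arms) if random.random() < eps else greedy(values)
        values[arm] += lr * (reward(arm, best) - values[arm])
    return values

def deploy(values, best, steps=500, continual=False, eps=0.2, lr=0.1):
    # Deployment in a changed environment (the best arm has moved).
    total = 0.0
    for _ in range(steps):
        explore = continual and random.random() < eps
        arm = random.randrange(len(values)) if explore else greedy(values)
        r = reward(arm, best)
        total += r
        if continual:
            # Originative: the training update keeps running in real time.
            values[arm] += lr * (r - values[arm])
    return total / steps

values = train(best=0)
frozen = deploy(list(values), best=2, continual=False)
adaptive = deploy(list(values), best=2, continual=True)
print(f"frozen policy avg reward:    {frozen:.2f}")
print(f"continual learner avg reward: {adaptive:.2f}")
```

The frozen agent keeps exploiting the arm that paid off in training, so its reward collapses when the world changes; the continual learner rediscovers the new best arm because the update rule never stopped running.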

Just because it was a leap forward for biology doesn’t necessarily mean it will be for AI, though. For example, if a robot is trained for 100,000 years in simulation and then deployed in the real world, the continual learning that occurs in the real world will be much slower than what occurred in simulation. This is the reverse of what happened with biological evolution and originative learning in animals. For AI, unlike animals, “evolution” might be much faster than real time originative learning.