University of Edinburgh
Might the frugal (but pro-active) use of neural resources be one of the essential keys to understanding how brains make sense of the world? Some recent work in computational and cognitive neuroscience suggests just such a picture. This work sheds light on the way brains like ours make sense of noisy and ambiguous sensory input. It also suggests, intriguingly, that perception, understanding and imagination are functionally co-emergent, arising as simultaneous results of a single underlying strategy known as ‘predictive coding’. This is the same strategy that saves on more mundane kinds of bandwidth, enabling the economical storage and transmission of pictures, sounds and videos using formats such as JPEG and MP3.
In the case of a picture (a black and white photo of Sir Laurence Olivier playing Hamlet, to conjure a concrete image in your mind), predictive coding works by assuming that the value of each pixel is well-predicted by the value of its various neighbors. When that’s true – which is rather often, as grey-scale gradients are pretty smooth for large parts of most images – there is simply no need to transmit the value of that pixel. All that the photo-frugal need transmit are the deviations from what was thus predicted. The simplest prediction would be that neighboring pixels all share the same value (the same grey-scale value, for example), but much more complex predictions are also possible. As long as there is detectable regularity, prediction (and hence this particular form of data compression) is possible.
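To make the idea concrete, here is a minimal sketch in Python of the simplest prediction just described: each pixel is assumed to share the value of its left-hand neighbor, and only the deviations are sent. The function names, the sample row of grey values, and the shared starting prediction of 0 are all illustrative assumptions; real image and audio formats are far more elaborate than this.

```python
def encode(row):
    """Return the deviations of each pixel from its predicted value.

    The prediction (an assumption of this sketch) is simply "same grey
    value as the pixel to the left", starting from an agreed value of 0.
    """
    residuals = []
    previous = 0  # starting prediction, assumed known to sender and receiver
    for pixel in row:
        residuals.append(pixel - previous)  # transmit only the deviation
        previous = pixel
    return residuals


def decode(residuals):
    """Rebuild the row by adding each deviation back to the prediction."""
    row = []
    previous = 0  # same agreed starting prediction as the sender used
    for deviation in residuals:
        pixel = previous + deviation
        row.append(pixel)
        previous = pixel
    return row


# A hypothetical row of grey-scale values: smooth, with one sharp edge.
row = [120, 120, 120, 121, 121, 200, 200, 200]
residuals = encode(row)
print(residuals)                 # [120, 0, 0, 1, 0, 79, 0, 0]
print(decode(residuals) == row)  # True
```

Most of the deviations are zero or close to it, and runs of small numbers are much cheaper to store or transmit than the raw values. The receiver, armed with the same prediction rule, recovers the original row exactly.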
Such compression by informed prediction (as Bell Telephone Labs first discovered back in the 1950s) can save enormously on bandwidth, allowing quite modest encodings to be reconstructed, by in effect ‘adding back in’ the successfully predicted elements, into rich and florid renditions of the original sights and sounds. The trick is trading intelligence and foreknowledge (expectations, informed predictions) on the part of the receiver against the costs of encoding and transmission on the day. A version of this same trick may be helping animals like us to sense and understand the world by allowing us to use what we already know to predict as much of the current sensory data as possible. When you think you see or hear your beloved cat or dog when the door or wind makes just the right jiggle or rustle, you are probably using well-trained prediction to fill in the gaps, saving on input-dominated bandwidth and (usually) knowing your world better as a result. Neural versions of this ‘predictive coding’ trick benefit, however, from an important added dimension: the use of a stacked hierarchy of processing stages. In biological brains, the prediction-based strategy unfolds within multiple layers, each of which deploys its own specialized knowledge and resources to try to predict the states of the level below it.
This is not easy to imagine, but it rewards the effort. A familiar, but still useful, analogy is with the way problems and issues are passed up the chain of command in rather traditional management hierarchies. Each person in the chain must learn to distil important (hence usually surprising or unpredicted) information from those lower down the chain. And they must do so in a way that is sufficiently sensitive to the needs (hence expectations) of those immediately above them. In this kind of multi-level chain, all that flows upwards is news. What flows forward are just the deviations from each level’s predicted events and unfoldings. This is efficient. Valuable bandwidth is not used sending well-predicted stuff forwards. Why bother? We were expecting all that stuff anyway. What gets marked and passed forward in the brain’s flow of processing are just the divergences from predicted states: divergences that may be used to demand more information at those very specific points, or to guide remedial action.
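The chain-of-command picture can be caricatured in a few lines of Python. This is a toy under loud assumptions, not a model of cortex: the `Level` class, its `tolerance` parameter, the rule "expect whatever you last received", and the sample inputs are all hypothetical, chosen only to show that well-predicted input generates no upward traffic.

```python
class Level:
    """One stage in a toy hierarchy (hypothetical; not a neural model)."""

    def __init__(self, tolerance=0):
        self.expected = 0           # current prediction of what arrives from below
        self.tolerance = tolerance  # deviations this small do not count as news

    def receive(self, signal):
        """Return the deviation from the prediction if it is news, else None."""
        error = signal - self.expected
        if abs(error) <= self.tolerance:
            return None             # well predicted: nothing flows upward
        self.expected = signal      # revise the expectation for next time
        return error                # only the deviation is passed on


def run(levels, inputs):
    """Feed inputs to the bottom level; count messages each level passes up."""
    sent = [0] * len(levels)
    for value in inputs:
        message = value
        for i, level in enumerate(levels):
            message = level.receive(message)
            if message is None:
                break               # no news, so higher levels hear nothing
            sent[i] += 1
    return sent


# Seven inputs arrive, but each level passes on only two messages.
print(run([Level(), Level()], [5, 5, 5, 5, 9, 9, 9]))  # [2, 2]
```

The only moments at which anything travels up the chain are the first arrival and the jump from 5 to 9. Everything else was expected, and so stays put.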
All this, if true, has much more than merely engineering significance. For it suggests that perception may best be seen as what has sometimes been described as a process of ‘controlled hallucination’ (Ramesh Jain) in which we (or rather, various parts of our brains) try to predict what is out there, using the incoming signal more as a means of tuning and nuancing the predictions than as a rich (and bandwidth-costly) encoding of the state of the world. This in turn underlines the surprising extent to which the structure of our expectations (both conscious and non-conscious) may quite literally be determining much of what we see, hear, and feel.
The basic effect hereabouts is neatly illustrated by a simple but striking demonstration (used by the neuroscientist Richard Gregory back in the 1970s to make this very point) known as ‘the hollow face illusion.’ This is a well-known illusion in which an ordinary face-mask viewed from the back (which is concave, to fit your face) appears strikingly convex when viewed from a modest distance. That is, it looks (from the back) to be shaped like a real face, with the nose sticking outwards rather than having a concave nose-cavity. Just about any hollow face-mask will produce some version of this powerful illusion, and there are many examples on the web. The hollow face illusion illustrates the power of what cognitive psychologists call ‘top-down’ (essentially, knowledge-driven) influences on perception. Our statistically salient experience with endless hordes of convex faces in daily life installs a deep expectation of convexity: an expectation that here trumps the many other visual cues that ought to be telling us that what we are seeing is a concave mask.
You might reasonably suspect that the hollow face illusion, though striking, is really just some kind of psychological oddity. And to be sure, our expectations concerning the convexity of faces seem especially strong and potent. But if the predictive coding approaches I mentioned earlier are on track, this strategy might actually pervade human perception. Brains like ours may be constantly trying to use what they already know so as to predict the current sensory signal, using the incoming signal to constrain those predictions, and sometimes using the expectations to ‘trump’ certain aspects of the incoming sensory signal itself. (Such trumping makes adaptive sense, as the capacity to use what you know to outweigh some of what the incoming signal seems to be saying can be hugely beneficial when the sensory data is noisy, ambiguous, or incomplete – situations that are, in fact, pretty much the norm in daily life.)
This image of the brain (or more accurately, of sensory and motor cortex) as an engine of prediction is a simple and quite elegant one that can be found in various forms in contemporary neuroscience (for useful surveys, see Kveraga et al. (2007) and Bubic et al. (2010); for a rich but challenging incarnation, see Friston (2010)). It has also been shown, at least in restricted domains, to be computationally sound and practically viable. Just suppose (if only for the sake of argument) that it is on track, and that perception is indeed a process in which incoming sensory data is constantly matched with ‘top-down’ predictions based on unconscious expectations of how that sensory data should be. This would have important implications for how we should think about minds like ours.
First, consider the unconscious expectations themselves. Those unconscious expectations derive mostly from the statistical shape of the world as we have experienced it in the past. That means we should probably be very careful about the shape of the worlds to which we expose ourselves, and our children. We see the world by applying the expectations generated by the statistical lens of our own past experience, and not (mostly) by applying the more delicately rose-nuanced lenses of our political and social aspirations. So if the world that tunes those expectations is sexist or racist, that will structure the unconscious expectations that condition humanity’s own future perceptions – a royal recipe for tainted evidence and self-fulfilling negative prophecies.
Second, reflect that perception (at least of this stripe) now looks to be deeply linked to something not unlike imagination. For insofar as a creature can indeed predict its own sensory inputs from the ‘top down’, such a creature is well-positioned to engage in familiar (though perhaps otherwise deeply puzzling) activities like dreaming and some kind of free-floating imagining. These would occur when the constraining sensory input is switched off, by closing down the sensors, leaving the system free to be driven purely from the top down. We should not suppose that all creatures deploying this strategy can engage in the kinds of self-conscious deliberate imagining that we do. Self-conscious deliberate imagining may well require substantial additional innovations, such as the use of language as a means of self-cuing. But where we find perception working in this way, we may expect an interior mental life of a fairly rich stripe, replete with dreams and free-floating episodes of mental imagery.
Finally, perception and understanding would also be revealed as close cousins. For to perceive the world in this way is to deploy knowledge not just about how the sensory signal should be right now, but about how it will probably change and evolve over time. For it is only by means of such longer-term and larger-scale knowledge that we can robustly match the incoming signal, moment to moment, with apt expectations (predictions). To know that (to know how the present sensory signal is likely to change and evolve over time) just is to understand a lot about how the world is, and the kinds of entity and event that populate it. Creatures deploying this strategy, when they see the grass twitch in just that certain way, are already expecting to see the tasty prey emerge, and already expecting to feel the sensations of their own muscles tensing to pounce. But an animal, or machine, that has that kind of grip on its world is already deep into the business of understanding that world.
I find the unity here intriguing. Perhaps we humans, and a great many other organisms too, are deploying a fundamental, frugal, prediction-based strategy that delivers perceiving, understanding, and imagining in a single package? Now there’s a deal!
A version of this material appeared as “Do Thrifty Brains Make Better Minds” on The Stone (philosophy blog of The New York Times) Jan 15 2012.
Bubic, A., von Cramon, D. Y., and Schubotz, R. I. (2010). Prediction, cognition and the brain. Frontiers in Human Neuroscience, 4:25, 1-15.
Friston, K. (2010). The free-energy principle: a unified brain theory? Nature Reviews Neuroscience, 11(2), 127-138.
Helmholtz, H. (1860/1962). Handbuch der physiologischen Optik (J. P. C. Southall, Ed., English trans.), Vol. 3. New York: Dover.
Kveraga, K., Ghuman, A. S., and Bar, M. (2007). Top-down predictions in the cognitive brain. Brain and Cognition, 65, 145-168.