Tuesday 16 June 2015

Bayesian Inference, Predictive Coding and Delusions

This is the third in a series of posts on the papers published in an issue of Avant on Delusions. Here Rick Adams summarises his paper (co-written with Harriet R. Brown and Karl J. Friston), 'Bayesian Inference, Predictive Coding and Delusions'.

I am in training to become a psychiatrist. I have also recently completed a PhD at UCL under Prof Karl Friston, a renowned computational neuroscientist. I am part of a new field known as Computational Psychiatry (CP). CP tries to explain how various phenomena in psychiatry could be understood in terms of brain computations (see also Corlett and Fletcher 2014, Montague et al. 2012, and Adams et al. forthcoming in JNNP).

One phenomenon that ought to be amenable to a computational understanding is the formation of both ‘normal’ beliefs (i.e. beliefs which are generally agreed to be reasonable) and delusions.

There are strong theoretical reasons to suppose that we (and other organisms) form beliefs in a Bayesian way. Thomas Bayes was an 18th century mathematician who tackled ‘inverse’ probability problems. ‘Direct’ probability is the probability of some data given their causes, e.g. a fair coin toss resulting in a ‘head’. Inverse probability is the opposite, e.g. the probability of a coin being fair, given a particular distribution of heads and tails. Bayes showed how to calculate the likely causes of data given the data and pre-existing beliefs (called ‘priors’) about the existence of those causes.
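The coin example can be made concrete with a small sketch. It is only an illustration: the function name `posterior_fair`, the choice of exactly two rival hypotheses (a fair coin versus one biased towards heads), and the bias value of 0.8 are all assumptions introduced here, not part of the paper.

```python
def posterior_fair(heads, tails, prior_fair=0.5, biased_p=0.8):
    """Posterior probability that a coin is fair, given observed tosses.

    Illustrative assumptions: only two hypotheses are considered, a fair
    coin (P(head) = 0.5) and a biased coin (P(head) = biased_p).
    """
    # 'Direct' probability: likelihood of the data under each cause.
    like_fair = 0.5 ** (heads + tails)
    like_biased = biased_p ** heads * (1 - biased_p) ** tails

    # Bayes' rule: weight each likelihood by its prior, then normalise.
    evidence = prior_fair * like_fair + (1 - prior_fair) * like_biased
    return prior_fair * like_fair / evidence


# 'Inverse' probability: how plausible is fairness after seeing the data?
print(posterior_fair(heads=5, tails=5))  # balanced data favour the fair coin
print(posterior_fair(heads=8, tails=2))  # lopsided data favour the biased coin
```

With no data at all the function simply returns the prior, which is the sense in which prior beliefs carry the inference when evidence is scarce.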

This inverse (now known as Bayesian) probability problem confronts all organisms with sensory systems: they collect sensory data and wish to infer the causes of those data. Sensory data are often extremely complex and noisy, and in this case appropriate prior beliefs are required to interpret them. ‘Beliefs’ used in this sense refer to probability distributions, not folk psychological statements.

Our prior beliefs are thought to take the form of a model of how the world causes sensory data. For example, knowing someone’s identity makes predictions about their configuration of facial features, which makes predictions about how colours and edges are distributed in a visual image of their face.

This model is hierarchical, as it has many levels of increasing complexity and abstraction as you move up from the raw sensory data to edges and colours, to individual features, to faces and to identities. Hierarchical models can use ‘predictive coding’ to predict low-level data by exploiting their high-level descriptions.

In predictive coding, a unit at a given hierarchical level sends messages to one or more units at lower levels which predict their activity; discrepancies between these predictions and the actual input are then passed back up the hierarchy in the form of prediction errors. These prediction errors revise the higher-level predictions, and this hierarchical message passing continues in an iterative fashion.
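The message passing described above can be sketched as a toy loop with just two levels. This is a deliberately minimal, assumed setup (one higher-level belief, a linear top-down prediction with a fixed `weight`, and a simple gradient-style update with a hypothetical `step` size); it is not the scheme used in the paper.

```python
def predictive_coding(sensory_input, weight=2.0, step=0.1, n_iter=200):
    """Toy two-level predictive coding loop (illustrative assumptions only).

    The higher level holds a single belief. It predicts the lower-level
    input as weight * belief, receives the prediction error, and revises
    the belief a little on every iteration.
    """
    belief = 0.0
    errors = []
    for _ in range(n_iter):
        prediction = weight * belief         # top-down message
        error = sensory_input - prediction   # bottom-up prediction error
        belief += step * weight * error      # the error revises the belief
        errors.append(error)
    return belief, errors


belief, errors = predictive_coding(sensory_input=4.0)
print(belief)      # settles close to sensory_input / weight
print(errors[-1])  # the prediction error has been 'explained away'
```

Each pass shrinks the prediction error, so the loop stops changing once the higher-level belief predicts the input well.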

Exactly which predictions ought to be changed in order to explain away a given prediction error is a crucial question for hierarchical models. The Bayesian solution to this problem is to make the biggest updates to the level whose uncertainty is greatest relative to the incoming data at the level below (Mathys et al. 2011). For example, if you are very uncertain about a person’s identity, but their face is encoded with great precision, you ought to update your beliefs about their identity a lot.
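A rough way to see this is the textbook update for a single Gaussian belief, where precision is the inverse of variance. The sketch below is a simplification assumed for illustration (one level, one observation, hypothetical function and parameter names); it is not the hierarchical model of Mathys et al.

```python
def precision_weighted_update(prior_mean, prior_precision,
                              observation, obs_precision):
    """Update a Gaussian belief with one Gaussian observation.

    Precision = 1 / variance. The size of the update depends on the
    precision of the data relative to the precision of the prior belief.
    """
    posterior_precision = prior_precision + obs_precision
    learning_rate = obs_precision / posterior_precision
    prediction_error = observation - prior_mean
    posterior_mean = prior_mean + learning_rate * prediction_error
    return posterior_mean, posterior_precision


# Uncertain prior, precise data: the belief moves most of the way to the data.
print(precision_weighted_update(0.0, 1.0, 1.0, 9.0))

# Confident prior, imprecise data: the belief barely moves.
print(precision_weighted_update(0.0, 9.0, 1.0, 1.0))
```

The same prediction error therefore produces a large or a small change in belief depending only on where the precision sits.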

But what of delusions? As we have seen, the certainty (precision) of beliefs at different levels is crucial to the inferences the hierarchical model makes. If the beliefs at the top of the model are very uncertain, they will be vulnerable to large updates on the basis of little sensory evidence. Many delusional ideas have this characteristic: e.g. a bus leaving its stop just as I arrive could make me infer that the driver deliberately drove off as he does not like me.

There are strong neurobiological reasons (highlighted in our paper) to suppose that in schizophrenia – a disorder in which delusions are a prominent feature – there is an imbalance in the encoding of precision away from the top of the brain’s hierarchical model (where prior beliefs are more concentrated) and towards the bottom (sensory) levels. There is also a lot of evidence that subjects with a diagnosis of schizophrenia make inferences that depend less on prior beliefs: they are more resistant to visual illusions, for example.

This hypothesis by no means offers a comprehensive account of all delusions, but it does generate testable hypotheses in both cognitive and neurobiological domains that can be refined by empirical data. Notice that the problem of inverse probability is one faced by science, as well as perception.


  1. Capgras delusion has been reported in the context of at least 15 different neuropathological conditions, only one of which is schizophrenia. Is the predictive coding account meant to explain only those cases of Capgras delusion which occur in schizophrenia, leaving unexplained all cases of Capgras delusion in people without schizophrenia?

    This is a general point about delusions. For example, paranoid delusions have been reported in association with at least 70 neuropathological conditions (Manschreck, Comprehensive Psychiatry, 1979), only 5 of which are psychiatric. Is the predictive coding theory of delusion meant to apply to paranoid delusions in general, or only to paranoid delusions in schizophrenia (leaving the other 65 types of case unexplained)?

  2. Thanks very much for your comment, Max. You raise a critical point, which is that delusions are so heterogeneous that almost any theory seems able to accommodate at least some of them. Thus bad theories never die and science stalls…

    If the brain uses predictive coding, then predictive coding will always be relevant to understanding at least some aspects of the products of abnormal brain function, such as delusions. The critical question is at what level the pathology is best understood - e.g. is there something wrong with the way predictive coding is performing inference, or is there just a big bit of the network missing?

    When I used faces as an example of the balance between expectations and evidence I wasn't referring to Capgras delusion, but the latter is worth thinking about. In Capgras delusion, it seems that a) the sufferer has an expectation about how they ought to feel when they see someone's face, and b) this expectation (i.e. prediction) is violated, and c) this violation is resolved by the inference that this face belongs to an imposter.

    This chain of events could occur for several reasons. In our predictive coding account of schizophrenia, we propose that there is an imbalance in synaptic gain in the brain's hierarchy, with too much at the bottom and not enough at the top: this makes sensory level prediction errors overly precise, and prior beliefs overly imprecise or 'noisy'. Hence sensory impressions seem vivid and/or unreal, and beliefs about the world are vulnerable to large (and possibly unusual) updates.

    However, a similar result might occur through a specific lesion to pathways connecting emotional areas with face recognition areas, and a lesion to prefrontal areas, etc. In this case there is no generalised problem with synaptic gain, but there are more macroscopic problems with the network: nevertheless, in both cases, prediction errors are being explained away.

    How to differentiate between these perspectives? If there is a generalised imbalance in synaptic gain then one would expect other abnormalities, such as a resistance to visual illusions, and improved performance on the force-matching task. I would have thought these would be unlikely in patients with very specific macroscopic lesions.

    As an aside, I have long agreed with your point that one needs to explain both why unusual ideas arise and why they are not rejected. The nice thing about a hierarchical imbalance in synaptic gain is that it can potentially account for both factors. On its own, however, it doesn't yet explain why certain themes arise more often than others. Perhaps paranoia is so common because inferences about others’ intentions are amongst the most difficult, and hence are most vulnerable to a loss of precision in prefrontal cortex? Uncertainty about others’ intentions may then be subject to the ancient prior belief that in altered circumstances, other agents are threats (until counterevidence indicates otherwise). Such hypotheses remain very speculative until we can find reliable indices of synaptic gain in different brain areas – both biological (e.g. resting state functional connectivity) and computational (e.g. softmax decision ‘noise’ in the beads task) – that correlate with delusional presence or intensity.

