| Latest News | Date |
|---|---|
| Paper on attentional tracking published | Tuesday, October 06 |
| CNN reports on our speech perception paper | Wednesday, March 04 |
Our laboratory studies the behavioral and neural mechanisms of perception using a combination of psychophysics and neural modeling. In particular, we investigate how the brain represents and processes uncertain information. Sensory information received by the brain is typically uncertain (for instance, because of poor signal quality or ambiguity in the world), yet it must constantly be manipulated to generate accurate, task-relevant behavior. Much of what we do in the lab follows the same logic, which the rest of this page explains in more detail.
High-level perception
We are mostly interested in forms of perception that require the extraction of high-level, relatively abstract knowledge from a set of noisy stimuli. Here are a few examples:
Psychophysics
We study such behaviors in humans in simplified laboratory settings, so that we can isolate the essential mechanisms. A psychophysics experiment in our lab typically involves presenting brief visual or auditory stimuli to a subject and asking them to report something about the stimuli. In this way, a large amount of behavioral data can be collected in a short time.
Bayesian modeling
We model our data using a principled approach called Bayesian modeling. The starting point is to ask: what is the best possible way in which an observer could solve the task based on noisy sensory information? The answer to this question is given by a strategy called Bayesian inference or optimal inference; an observer following this strategy is called a Bayesian or optimal observer. Optimal inference requires integrating pieces of information in a way that takes into account their uncertainty. There is extensive evidence from many different perceptual tasks that humans (and monkeys) indeed do this and are close-to-optimal observers. However, for tasks like the ones listed above, Bayesian models have not been explored as much as for simpler tasks. In our lab, we attempt to establish whether human behavior in complex perceptual tasks can be best described by a Bayesian observer model. (If not, we explore the nature of the deviations.)
For an introduction to Bayesian modeling, read this entry in the SAGE Encyclopedia of Perception.
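As a minimal sketch of what optimal inference means in the simplest possible case, consider a hypothetical task (not one of the lab's experiments) in which a stimulus takes one of two values, -1 or +1, and the observer receives a single measurement corrupted by Gaussian noise. The function names, the stimulus values, and the Gaussian noise model are all assumptions made for illustration.

```python
import math


def log_posterior_ratio(x, sigma, prior_plus=0.5):
    """Log of p(s=+1 | x) / p(s=-1 | x) for a measurement x ~ Normal(s, sigma^2)."""
    # Prior belief, expressed as a log odds ratio.
    log_prior_ratio = math.log(prior_plus / (1.0 - prior_plus))
    # Gaussian log likelihood ratio:
    # [(x + 1)^2 - (x - 1)^2] / (2 sigma^2) = 2 x / sigma^2
    log_likelihood_ratio = 2.0 * x / sigma**2
    return log_prior_ratio + log_likelihood_ratio


def optimal_report(x, sigma, prior_plus=0.5):
    """Report the stimulus value with the higher posterior probability."""
    return 1 if log_posterior_ratio(x, sigma, prior_plus) > 0 else -1


# Same measurement and prior, different noise levels.
print(optimal_report(0.3, sigma=1.0, prior_plus=0.2))
print(optimal_report(0.3, sigma=0.5, prior_plus=0.2))
```

With a prior probability of 0.2 for +1, the measurement 0.3 leads to the report -1 when noise is high (sigma = 1.0) and to +1 when noise is low (sigma = 0.5). This is the sense in which an optimal observer takes uncertainty into account: the less noisy the measurement, the more it is allowed to override the prior.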
Neural coding

A key quest in neuroscience is to understand the relationship between behavior and neural activity. We contribute to this by developing plausible neural implementations of what human subjects do in our perceptual tasks. Instead of trying ad hoc neural networks, we are guided by our Bayesian models of behavior. Neural populations maintain representations of uncertainty (or even of entire probability distributions) and can use these to perform optimal computation. By connecting several such populations, we build a network for a given task, whose performance we can compare with that of an optimal observer. This involves mathematical work and simulations. The resulting networks are capable of approximating optimal inference regardless of the inputs. The next step is to make such networks more biologically realistic and produce testable predictions.
Visual search
The ability to find a target among distractors depends on set size, target-distractor similarity, and distractor heterogeneity. Early research on visual search explained these dependencies in terms of limited capacity or a distinction between serial and parallel processes. In later work, however, it was found that signal detection theory (SDT) could account for human data quantitatively much better, while making only minimal assumptions. SDT is a probabilistic theory in which the internal representation of a stimulus variable is continuous and corrupted by noise. To date, SDT is the most successful theory for describing the psychophysics of visual search.
In our laboratory, we work on a neural Bayesian theory of visual search, which is a natural and principled generalization of SDT models. At each location in a search display, a neural population encodes not just the most likely value of the stimulus, but an entire probability distribution over the stimulus. The Bayesian theory predicts how such probabilistic information is integrated across the display to obtain a global judgment about target presence or absence. We currently investigate how neural networks can perform Bayes-optimal visual search.
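The integration across the display can be sketched at the behavioral level as follows. This is an illustrative sketch under stated assumptions, not the lab's neural model: each location yields one measurement with Gaussian noise, the target and distractor values (1.0 and 0.0) are known, at most one target is present, and it is equally likely to be at any location. All function and parameter names are hypothetical.

```python
import math


def local_log_lr(x, sigma, s_target=1.0, s_distractor=0.0):
    """Log likelihood ratio (target vs. distractor) at one location."""
    return ((x - s_distractor) ** 2 - (x - s_target) ** 2) / (2.0 * sigma**2)


def global_log_lr(xs, sigmas, s_target=1.0, s_distractor=0.0):
    """Log likelihood ratio of 'target present' vs. 'target absent'.

    With one target at an unknown, equally likely location, the global
    likelihood ratio is the average of the local likelihood ratios.
    """
    d = [local_log_lr(x, s, s_target, s_distractor) for x, s in zip(xs, sigmas)]
    m = max(d)  # subtract the maximum for numerical stability (log-sum-exp)
    return m + math.log(sum(math.exp(di - m) for di in d)) - math.log(len(d))


def report_target_present(xs, sigmas, prior_present=0.5):
    """Bayes-optimal decision: 'present' if the posterior log odds exceed 0."""
    log_prior_ratio = math.log(prior_present / (1.0 - prior_present))
    return global_log_lr(xs, sigmas) + log_prior_ratio > 0


sigmas = [0.3, 0.3, 0.3, 0.3]
print(report_target_present([0.1, 0.9, -0.2, 0.0], sigmas))  # one target-like item
print(report_target_present([0.1, -0.1, 0.0, 0.2], sigmas))  # all distractor-like
```

Because each local term is divided by its own noise variance, a location measured with high uncertainty contributes little evidence either way, which is how the reliability of each item enters the global judgment.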
Visual short-term memory
Visual short-term memory (VSTM) is a form of memory that lasts between 1 and several tens of seconds. It is a temporary buffer for visual information. One use of VSTM is in change detection: the remembered information allows one to compare the current visual scene with the scene from several seconds ago. It is commonly believed that visual short-term memory has a limited capacity of 3 to 4 items (the "magical number 4"). In this view, an item in a display is remembered either perfectly or not at all. There is considerable evidence against this model, but a theoretical framework has been lacking so far. We are working on an alternative model in which the appearance of limited capacity arises from uncertainty increasing with set size. In this view, VSTM encoding is not all-or-none but graded: you remember many items a little bit, and each one less well if you have more to remember. We perform change detection and delayed estimation experiments to measure human performance as a function of set size. This allows us to test limited-capacity models against uncertainty-based models.
The founding paper of uncertainty-based models is Wilken and Ma (2004). A comparison between the two types of models applied to a different problem (multiple object tracking) can be found in Ma and Huang (2009).
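The contrast between the two model classes can be made concrete with a toy sketch. Everything specific here is an assumption for illustration and is not taken from the cited papers: the capacity of 4 items, the power-law growth of memory noise with set size, the single-item noise of 5 units, and the Gaussian (non-circular) error distribution.

```python
import math


def item_limit_p_remembered(set_size, capacity=4):
    """Limited-capacity view: each item is stored perfectly or not at all."""
    return min(1.0, capacity / set_size)


def graded_noise_sd(set_size, sd_single=5.0, exponent=0.5):
    """Uncertainty-based view: every item is stored, but memory noise grows
    with set size (assumed power law; sd_single is the noise for one item)."""
    return sd_single * set_size**exponent


def p_error_within(tolerance, sd):
    """P(|error| < tolerance) for zero-mean Gaussian estimation error."""
    return math.erf(tolerance / (sd * math.sqrt(2.0)))


for n in (1, 2, 4, 8):
    print(n, item_limit_p_remembered(n), round(p_error_within(10.0, graded_noise_sd(n)), 3))
```

In this sketch the limited-capacity model predicts perfect memory up to the capacity and a sharp drop beyond it, whereas the uncertainty-based model predicts a smooth decline in the fraction of accurate estimates from the smallest set size onward. Delayed estimation data as a function of set size can in principle distinguish these two patterns.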
Multisensory perception
To gather information about the world, humans constantly combine signals from different sensory modalities. For example, when we cross the street, we use both visual and auditory information about the cars approaching us. Laboratory experiments using multisensory stimuli have revealed that the brain - unconsciously - takes into account the relative reliabilities of the signals when combining them: more reliable information is valued more. This is a form of Bayesian inference. Moreover, when a small conflict is introduced between two sensory signals, it is possible to manipulate what observers perceive by changing the relative reliabilities of the signals. This is the principle that underlies multisensory illusions like the McGurk effect and the ventriloquist effect.
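The reliability weighting described above has a simple closed form when both signals are corrupted by independent Gaussian noise and the prior is flat. The sketch below assumes exactly that; the function name and the example numbers are hypothetical.

```python
import math


def combine_cues(x_visual, sigma_visual, x_auditory, sigma_auditory):
    """Combine two noisy measurements of the same quantity.

    Each cue is weighted by its reliability (inverse variance). Returns the
    combined estimate and its standard deviation.
    """
    w_v = 1.0 / sigma_visual**2
    w_a = 1.0 / sigma_auditory**2
    estimate = (w_v * x_visual + w_a * x_auditory) / (w_v + w_a)
    combined_sigma = math.sqrt(1.0 / (w_v + w_a))
    return estimate, combined_sigma


# A reliable visual cue at 0 and a less reliable auditory cue at 10.
estimate, sigma = combine_cues(0.0, 1.0, 10.0, 2.0)
print(estimate, sigma)
```

In this example the combined estimate lies at 2.0, much closer to the more reliable visual cue, and the combined standard deviation is smaller than that of either cue alone. Making the visual cue noisier would shift the estimate toward the auditory cue, which is the manipulation that the conflict experiments exploit.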
In our laboratory, we study different aspects of multisensory perception from the perspective of Bayesian inference. In Ma, Beck et al. (2006), we showed how optimal Bayesian inference can be implemented in neural circuits. We have also built a theory of how the brain decides whether two sensory signals have a common source or different sources. This decision-making process is called causal inference. In Ma, Zhou et al. (2009), we showed that open-set word recognition using video and noisy audio can be understood as Bayesian inference. Current projects deal with experimental and neurotheoretical aspects of causal inference. Eventually, we would like to rewrite the principles of multisensory perception in a mathematically rigorous way.
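The common-source decision can be sketched as a comparison of two hypotheses. The version below is an illustration under assumptions that are not stated in the text above: measurements have independent Gaussian noise, source locations follow a zero-mean Gaussian prior with standard deviation `sigma_prior`, and under the separate-sources hypothesis the two sources are drawn independently from that prior. All names are hypothetical.

```python
import math


def normal_pdf(x, variance):
    """Density of a zero-mean Gaussian with the given variance."""
    return math.exp(-0.5 * x * x / variance) / math.sqrt(2.0 * math.pi * variance)


def p_common_source(x_v, x_a, sigma_v, sigma_a, sigma_prior, prior_common=0.5):
    """Posterior probability that two measurements share a single source."""
    var_v, var_a, var_p = sigma_v**2, sigma_a**2, sigma_prior**2

    # One source: the measurements are jointly Gaussian and correlated,
    # because both inherit the variability of the shared source.
    det = var_v * var_a + var_v * var_p + var_a * var_p
    quad = ((x_v - x_a) ** 2 * var_p + x_v**2 * var_a + x_a**2 * var_v) / det
    like_common = math.exp(-0.5 * quad) / (2.0 * math.pi * math.sqrt(det))

    # Two sources: the measurements are independent.
    like_separate = normal_pdf(x_v, var_v + var_p) * normal_pdf(x_a, var_a + var_p)

    numerator = prior_common * like_common
    return numerator / (numerator + (1.0 - prior_common) * like_separate)


print(p_common_source(0.5, -0.5, 1.0, 1.0, 10.0))  # small conflict
print(p_common_source(5.0, -5.0, 1.0, 1.0, 10.0))  # large conflict
```

A small discrepancy between the two measurements favors a common source, while a discrepancy that is large relative to the sensory noise favors separate sources. An observer following this rule would integrate the signals in the first case and keep them apart in the second.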