A bottom–up model of spatial attention predicts human error patterns in rapid scene recognition

Einhäuser, Wolfgang and Mundhenk, T. Nathan and Baldi, Pierre and Koch, Christof and Itti, Laurent (2007) A bottom–up model of spatial attention predicts human error patterns in rapid scene recognition. Journal of Vision, 7 (10). Art. No. 6. ISSN 1534-7362. doi:10.1167/7.10.6.

Humans demonstrate a peculiar ability to detect complex targets in rapidly presented natural scenes. Recent studies suggest that (nearly) no focal attention is required for overall performance in such tasks. Little is known, however, of how detection performance varies from trial to trial and which stages in the processing hierarchy limit performance: bottom–up visual processing (attentional selection and/or recognition) or top–down factors (e.g., decision-making, memory, or alertness fluctuations)? To investigate the relative contribution of these factors, eight human observers performed an animal detection task in natural scenes presented at 20 Hz. Trial-by-trial performance was highly consistent across observers, far exceeding the prediction of independent errors. This consistency demonstrates that performance is not primarily limited by idiosyncratic factors but by visual processing. Two statistical stimulus properties, contrast variation in the target image and the information-theoretical measure of “surprise” in adjacent images, predict performance on a trial-by-trial basis. These measures are tightly related to spatial attention, demonstrating that spatial attention and rapid target detection share common mechanisms. To isolate the causal contribution of the surprise measure, eight additional observers performed the animal detection task in sequences that were reordered versions of those all subjects had correctly recognized in the first experiment. Reordering increased surprise before and/or after the target while keeping the target and distractors themselves unchanged. Surprise enhancement impaired target detection in all observers. Consequently, and contrary to several previously published findings, our results demonstrate that attentional limitations, rather than target recognition alone, affect the detection of targets in rapidly presented visual sequences.

Item Type: Article
ORCID: Koch, Christof, 0000-0001-6482-8067
Additional Information: © 2007 ARVO. Received December 3, 2006; published June 20, 2007. This work was supported by the Swiss National Science Foundation (W.E., PA00A-111447), DARPA, NGA, NSF, ONR, the NIMH, and HFSP.
Group: Koch Laboratory (KLAB)
Funding Agency: Grant Number
Swiss National Science Foundation (SNSF): PA00A-111447
Defense Advanced Research Projects Agency (DARPA): UNSPECIFIED
Office of Naval Research (ONR): UNSPECIFIED
National Institute of Mental Health (NIMH): UNSPECIFIED
Human Frontier Science Program: UNSPECIFIED
Subject Keywords: psychophysics, modeling, attention, saliency, RSVP
Issue or Number: 10
Record Number: CaltechAUTHORS:EINjov07
Official Citation: Einhäuser, W., Mundhenk, T. N., Baldi, P., Koch, C., & Itti, L. (2007). A bottom–up model of spatial attention predicts human error patterns in rapid scene recognition. Journal of Vision, 7(10):6, 1-13, doi:10.1167/7.10.6.
Usage Policy: No commercial reproduction, distribution, display or performance rights in this work are provided.
ID Code: 9686
Deposited By: Archive Administrator
Deposited On: 03 Mar 2008
Last Modified: 08 Nov 2021 21:01
