
Selective visual attention enables learning and recognition of multiple objects in cluttered scenes

Walther, Dirk B. and Rutishauser, Ueli and Koch, Christof and Perona, Pietro (2005) Selective visual attention enables learning and recognition of multiple objects in cluttered scenes. Computer Vision and Image Understanding, 100 (1-2). pp. 41-63. ISSN 1077-3142. doi:10.1016/j.cviu.2004.09.004.

Full text is not posted in this repository. Consult Related URLs below.


A key problem in learning representations of multiple objects from unlabeled images is that it is a priori impossible to tell which part of the image corresponds to each individual object, and which part is irrelevant clutter. Distinguishing individual objects in a scene would allow unsupervised learning of multiple objects from unlabeled images. There is psychophysical and neurophysiological evidence that the brain employs visual attention to select relevant parts of the image and to serialize the perception of individual objects. We propose a method for the selection of salient regions likely to contain objects, based on bottom-up visual attention. By comparing the performance of David Lowe’s recognition algorithm with and without attention, we demonstrate in our experiments that the proposed approach can enable one-shot learning of multiple objects from complex scenes, and that it can strongly improve learning and recognition performance in the presence of large amounts of clutter.

Item Type: Article
Related URLs:
Author | ORCID
Rutishauser, Ueli | 0000-0002-9207-7069
Koch, Christof | 0000-0001-6482-8067
Perona, Pietro | 0000-0002-7583-5809
Additional Information: Received 19 December 2003; accepted 29 September 2004. Available online 15 June 2005. This project was funded by the NSF Engineering Research Center for Neuromorphic Systems Engineering at Caltech, by an NSF-ITR award, the NIH, the NIMH, the Keck Foundation, and a Sloan-Swartz Fellowship to U.R. The region selection code was developed by the authors as part of the “iNVT” community effort. We thank Laurent Itti and the anonymous reviewers for comments on previous versions of the manuscript, and Evolution Robotics for making their robotic vision software development kit available to us. High-resolution background images were provided by TNO Human Factors Research Institute, the Netherlands.
Group: Koch Laboratory (KLAB)
Funding Agency | Grant Number
Center for Neuromorphic Systems Engineering, Caltech | UNSPECIFIED
W. M. Keck Foundation | UNSPECIFIED
Sloan-Swartz Fellowship | UNSPECIFIED
Subject Keywords: Bottom-up attention; Saliency; Selective attention; Object recognition; Object-based attention; Learning; Cluttered scenes
Issue or Number: 1-2
Record Number: CaltechAUTHORS:20130816-103237248
Persistent URL:
Official Citation: Dirk Walther, Ueli Rutishauser, Christof Koch, Pietro Perona, Selective visual attention enables learning and recognition of multiple objects in cluttered scenes, Computer Vision and Image Understanding, Volume 100, Issues 1–2, October–November 2005, Pages 41–63.
Usage Policy: No commercial reproduction, distribution, display or performance rights in this work are provided.
ID Code: 40541
Deposited By: KLAB Import
Deposited On: 12 Jan 2008 00:19
Last Modified: 09 Nov 2021 23:49
