Itti, Laurent and Gold, Carl and Koch, Christof (2001) Visual attention and target detection in cluttered natural scenes. Optical Engineering, 40 (9), 1784-1793. ISSN 0091-3286. http://resolver.caltech.edu/CaltechAUTHORS:20130816-103153992
Rather than attempting to fully interpret visual scenes in a parallel fashion, biological systems appear to employ a serial strategy by which an attentional spotlight rapidly selects circumscribed regions in the scene for further analysis. The spatiotemporal deployment of attention has been shown to be controlled by both bottom-up (image-based) and top-down (volitional) cues. We describe a detailed neuromimetic computer implementation of a bottom-up scheme for the control of visual attention, focusing on the problem of combining information across modalities (orientation, intensity, and color information) in a purely stimulus-driven manner. We have applied this model to a wide range of target detection tasks, using synthetic and natural stimuli. Performance has, however, remained difficult to objectively evaluate on natural scenes, because no objective reference was available for comparison. We present predicted search times for our model on the Search–2 database of rural scenes containing a military vehicle. Overall, we found a poor correlation between human and model search times. Further analysis, however, revealed that in 75% of the images, the model appeared to detect the target faster than humans (for comparison, we calibrated the model’s arbitrary internal time frame such that 2 to 4 image locations were visited per second). It seems that this model, which had originally been designed not to find small, hidden military vehicles, but rather to find the few most obviously conspicuous objects in an image, performed as an efficient target detector on the Search–2 dataset. Further developments of the model are finally explored, in particular through a more formal treatment of the difficult problem of extracting suitable low-level features to be fed into the saliency map.
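The serial scanning and time calibration mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a saliency map has already been computed, and the function name `predicted_search_time`, its parameters, the square inhibition neighbourhood, and the toy map are all hypothetical. It only shows how winner-take-all selection with inhibition of return yields a count of attentional shifts, and how a rate of 2 to 4 visited locations per second turns that count into a search time.

```python
def predicted_search_time(saliency, target, ior_radius=1,
                          shifts_per_second=3.0, max_shifts=100):
    """Return (shifts, seconds) until attention lands on the target, else None.

    saliency: 2D list of floats (assumed precomputed; hypothetical input).
    target: set of (row, col) cells covered by the target (hypothetical).
    """
    grid = [row[:] for row in saliency]  # copy so the caller's map is untouched
    rows, cols = len(grid), len(grid[0])
    for shift in range(1, max_shifts + 1):
        # Winner-take-all: attend to the most salient remaining location.
        r, c = max(((i, j) for i in range(rows) for j in range(cols)),
                   key=lambda p: grid[p[0]][p[1]])
        if grid[r][c] == float("-inf"):
            return None  # every location has already been inhibited
        if (r, c) in target:
            return shift, shift / shifts_per_second
        # Inhibition of return: suppress a square patch around the winner
        # (the square shape is an assumption made for simplicity).
        for i in range(max(0, r - ior_radius), min(rows, r + ior_radius + 1)):
            for j in range(max(0, c - ior_radius), min(cols, c + ior_radius + 1)):
                grid[i][j] = float("-inf")
    return None


# Toy 4x4 saliency map: a strong distractor at (1, 1), the target at (2, 3).
saliency = [
    [0.1, 0.2, 0.1, 0.0],
    [0.2, 0.9, 0.1, 0.0],
    [0.1, 0.1, 0.1, 0.6],
    [0.0, 0.0, 0.3, 0.1],
]
target = {(2, 3)}

slow = predicted_search_time(saliency, target, shifts_per_second=2.0)
fast = predicted_search_time(saliency, target, shifts_per_second=4.0)
```

In this toy example the distractor is attended first and the target second, so the predicted search time ranges from 0.5 s (4 locations per second) to 1.0 s (2 locations per second).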
|Additional Information:||© 2001 Society of Photo-Optical Instrumentation Engineers. Paper ATA-04 received Jan. 20, 2001; revised manuscript received Mar. 2, 2001; accepted for publication Mar. 23, 2001. We thank Dr. A. Toet from TNO-HFRI for providing us with the Search_2 dataset and all human data. This work was supported by NSF (Caltech ERC), NIMH, ONR, NATO, the Charles Lee Powell Foundation, and the USC School of Engineering. The original version of this material was first published by the Research and Technology Organization, North Atlantic Treaty Organization (RTO/NATO) in MP-45 (Search and Target Acquisition) in March 2000. This proceedings is available at [http://www.cso.nato.int/Pubs/rdp.asp?RDP=RTO-MP-045].|
|Group:||Koch Laboratory, KLAB|
|Subject Keywords:||visual attention; saliency; preattentive; inhibition of return; winner take all; bottom-up; natural scene; Search–2 dataset|
|Official Citation:||Itti L, Gold C, Koch C; Visual attention and target detection in cluttered natural scenes. Opt. Eng. 2001;40(9):1784-1793.|
|Usage Policy:||No commercial reproduction, distribution, display or performance rights in this work are provided.|
|Deposited By:||KLAB Import|
|Deposited On:||11 Jan 2008 23:05|
|Last Modified:||21 Nov 2013 19:45|