A Caltech Library Service

Vision of a Visipedia

Perona, Pietro (2010) Vision of a Visipedia. Proceedings of the IEEE, 98 (8). pp. 1526-1534. ISSN 0018-9219.



The web is not perfect: while text is easily searched and organized, pictures (the vast majority of the bits that one can find online) are not. In order to see how one could improve the web and make pictures first-class citizens of the web, I explore the idea of Visipedia, a visual interface for Wikipedia that is able to answer visual queries and enables experts to contribute and organize visual knowledge. Five distinct groups of humans would interact through Visipedia: users, experts, editors, visual workers, and machine vision scientists. The latter would gradually build automata able to interpret images. I explore some of the technical challenges involved in making Visipedia happen. I argue that Visipedia will likely grow organically, combining state-of-the-art machine vision with human labor.

Item Type:Article
ORCID:Perona, Pietro, 0000-0002-7583-5809
Additional Information:© 2010 IEEE. Manuscript received May 4, 2009; revised September 17, 2009; accepted April 9, 2010. Date of publication June 3, 2010; date of current version July 21, 2010. This work was supported by the California Institute of Technology (Caltech) and by the Office of Naval Research (ONR) University Research Initiative (MURI) under Grant N00014-06-1-0734. This paper was written while the author was visiting the Department of Information Engineering (DEI), University of Padova, Padova, Italy. He would like to thank Profs. G. Picci and G. Cortelazzo for their warm hospitality. The concept of Visipedia was developed in collaboration with T. Mita and P. Welinder at Caltech, and with S. Belongie and his students at the University of California San Diego (UCSD). The author would like to thank many colleagues who helped him formulate and present the ideas expressed in this paper. C. Tomasi made the author aware of Bush’s memex. J. Stevenson, K. Grauman, K. Branson, and P. Dollar read a draft and improved both language and exposition. L. Lazebnik, A. Efros, and F.-F. Li made many useful comments. The concept was presented in August 2009 at Banff, AB, Canada, during the “Vision and the Internet” workshop. A number of participants made useful suggestions and comments, in particular, J. Malik, K. Koutulakos, D. Forsyth, B. Aguera y Arcas, R. Szeliski, D. Hoiem, S. Seitz, and A. Zisserman.
Funding Agency:Office of Naval Research (ONR) University Research Initiative (MURI)
Grant Number:N00014-06-1-0734
Subject Keywords:Crowdsourcing; image understanding; machine learning; machine vision; Visipedia; visual recognition; Wikipedia
Other Numbering System:INSPEC Accession Number 11430138
Issue or Number:8
Record Number:CaltechAUTHORS:20101117-154723449
Official Citation:Perona, P., "Vision of a Visipedia," Proceedings of the IEEE, vol. 98, no. 8, pp. 1526-1534, Aug. 2010. doi: 10.1109/JPROC.2010.2049621
Usage Policy:No commercial reproduction, distribution, display or performance rights in this work are provided.
ID Code:20870
Deposited By: Tony Diaz
Deposited On:18 Nov 2010 21:58
Last Modified:03 Oct 2019 02:16
