Oard, Douglas W. and Soergel, Dagobert and Doermann, David and Huang, Xiaoli and Murray, G. Craig and Wang, Jianqiang and Ramabhadran, Bhuvana and Franz, Martin and Gustman, Samuel and Mayfield, James and Kharevych, Liliya and Strassel, Stephanie (2004) Building an information retrieval test collection for spontaneous conversational speech. In: SIGIR '04 Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval. ACM, New York, NY, pp. 41-48. ISBN 1-58113-881-4. https://resolver.caltech.edu/CaltechAUTHORS:20161207-162256089
Full text is not posted in this repository. Consult Related URLs below.
Use this Persistent URL to link to this item: https://resolver.caltech.edu/CaltechAUTHORS:20161207-162256089
Abstract
Test collections model use cases in ways that facilitate evaluation of information retrieval systems. This paper describes the use of search-guided relevance assessment to create a test collection for retrieval of spontaneous conversational speech. Approximately 10,000 thematically coherent segments were manually identified in 625 hours of oral history interviews with 246 individuals. Automatic speech recognition results, manually prepared summaries, controlled vocabulary indexing, and name authority control are available for every segment. Those features were leveraged by a team of four relevance assessors to identify topically relevant segments for 28 topics developed from actual user requests. Search-guided assessment yielded sufficient inter-annotator agreement to support formative evaluation during system development. Baseline results for ranked retrieval are presented to illustrate use of the collection.
| Item Type: | Book Section |
|---|---|
| Related URLs: | |
| Additional Information: | © 2004 ACM. Thanks to Anton Leuski for help building queries and Meghan Glenn for comments. This work has been supported in part by NSF IIS Award 0122466 and NSF CISE RI Award EIA0130422. Any opinions, findings and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the NSF. |
| Funders: | |
| Subject Keywords: | Experimentation, Measurement, Automatic Speech Recognition, Search-Guided Relevance Assessment, Oral History |
| Classification Code: | H.3.3 [Information Storage and Retrieval]: Information Search and Retrieval |
| DOI: | 10.1145/1008992.1009002 |
| Record Number: | CaltechAUTHORS:20161207-162256089 |
| Persistent URL: | https://resolver.caltech.edu/CaltechAUTHORS:20161207-162256089 |
| Official Citation: | Douglas W. Oard, Dagobert Soergel, David Doermann, Xiaoli Huang, G. Craig Murray, Jianqiang Wang, Bhuvana Ramabhadran, Martin Franz, Samuel Gustman, James Mayfield, Liliya Kharevych, and Stephanie Strassel. 2004. Building an information retrieval test collection for spontaneous conversational speech. In Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval (SIGIR '04). ACM, New York, NY, USA, 41-48. DOI=http://dx.doi.org/10.1145/1008992.1009002 |
| Usage Policy: | No commercial reproduction, distribution, display or performance rights in this work are provided. |
| ID Code: | 72643 |
| Collection: | CaltechAUTHORS |
| Deposited By: | Kristin Buxton |
| Deposited On: | 08 Dec 2016 00:43 |
| Last Modified: | 11 Nov 2021 05:04 |