Creating a Data Collection for Evaluating Rich Speech Retrieval

Cited by: 0

Authors:

Eskevich, Maria ^{[1]}

Jones, Gareth J. F. ^{[1]}

Larson, Martha ^{[2]}

Ordelman, Roeland ^{[3]}

Affiliations:

[1] Dublin City Univ, Ctr Digital Video Proc, Sch Comp, Dublin 9, Ireland

[2] Delft Univ Technol, Delft, Netherlands

[3] Univ Twente, Enschede, Netherlands

Source:

LREC 2012 - EIGHTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION | 2012

Funding:

Science Foundation Ireland; European Union Seventh Framework Programme

Keywords:

Speech Search; Speech Collection Creation; Speech Retrieval; Crowdsourcing

DOI:

Not available

Chinese Library Classification (CLC) number:

H0 [Linguistics]

Subject classification codes:

030303; 0501; 050102

Abstract:

We describe the development of a test collection for the investigation of speech retrieval beyond identification of relevant content. This collection focuses on satisfying user information needs for queries associated with specific types of speech acts. The collection is based on an archive of Internet video from the video sharing platform blip.tv, and was provided by the MediaEval benchmarking initiative. A crowdsourcing approach was used to identify segments in the video data which contain speech acts, to create a description of the video containing the act, and to generate search queries designed to re-find this speech act. We describe and reflect on our experiences with crowdsourcing this test collection using the Amazon Mechanical Turk platform. We highlight the challenges of constructing this dataset, including the selection of the data source, the design of the crowdsourcing task, and the specification of queries and relevant items.

Citation

Pages: 1736-1743

Page count: 8
