Creating a Data Collection for Evaluating Rich Speech Retrieval

Cited by: 0
Authors
Eskevich, Maria [1]
Jones, Gareth J. F. [1]
Larson, Martha [2]
Ordelman, Roeland [3]
Affiliations
[1] Dublin City Univ, Ctr Digital Video Proc, Sch Comp, Dublin 9, Ireland
[2] Delft Univ Technol, Delft, Netherlands
[3] Univ Twente, Enschede, Netherlands
Source
LREC 2012 - EIGHTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION | 2012
Funding
Science Foundation Ireland; European Union Seventh Framework Programme
Keywords
Speech Search; Speech Collection Creation; Speech Retrieval; Crowdsourcing;
DOI
Not available
Chinese Library Classification number
H0 [Linguistics];
Discipline classification codes
030303 ; 0501 ; 050102 ;
Abstract
We describe the development of a test collection for the investigation of speech retrieval beyond identification of relevant content. This collection focuses on satisfying user information needs for queries associated with specific types of speech acts. The collection is based on an archive of Internet video from the video sharing platform blip.tv, and was provided by the MediaEval benchmarking initiative. A crowdsourcing approach was used to identify segments in the video data that contain speech acts, to create a description of the video containing each act, and to generate search queries designed to re-find that speech act. We describe and reflect on our experiences with crowdsourcing this test collection using the Amazon Mechanical Turk platform. We highlight the challenges of constructing this dataset, including the selection of the data source, the design of the crowdsourcing task, and the specification of queries and relevant items.
Pages: 1736-1743
Number of pages: 8