Annotation Tool Development for Large-Scale Corpus Creation Projects at the Linguistic Data Consortium

Cited by: 0
Authors
Maeda, Kazuaki [1]
Lee, Haejoong [1]
Medero, Shawn [1]
Medero, Julie [1]
Parker, Robert [1]
Strassel, Stephanie [1]
Affiliations
[1] Univ Penn, Linguist Data Consortium, Philadelphia, PA 19104 USA
DOI
Not available
Chinese Library Classification (CLC) number
H0 [Linguistics];
Discipline classification codes
030303 ; 0501 ; 050102 ;
Abstract
The Linguistic Data Consortium (LDC) creates a variety of linguistic resources - data, annotations, tools, standards and best practices - for many sponsored projects. The programming staff at LDC has created tools and technical infrastructure to support all aspects of data creation for these projects: data scouting, data collection, data selection, annotation, search, data tracking and workflow management. This paper introduces a number of samples of the LDC programming staff's work, with particular focus on recent additions and updates to the suite of software tools developed by LDC. Tools introduced include the GScout Web Data Scouting Tool, LDC Data Selection Toolkit, ACK - Annotation Collection Kit, XTrans Transcription and Speech Annotation Tool, GALE Distillation Toolkit, and the GALE MT Post Editing Workflow Management System.
Pages: 3052 - 3056
Page count: 5