Rediscovering 15+2 years of discoveries in language resources and evaluation

Cited by: 5
Authors
Mariani, Joseph [1, 2]
Paroubek, Patrick [1]
Francopoulo, Gil [2, 3]
Hamon, Olivier [4]
Affiliations
[1] Univ Paris Saclay, CNRS, LIMSI, Orsay, France
[2] CNRS, IMMI, F-91405 Orsay, France
[3] Tagmatica, Paris, France
[4] Syllabs, Paris, France
Funding
Engineering and Physical Sciences Research Council (UK); National Science Foundation (US);
Keywords
ELRA Anthology; Language resources; Language processing systems evaluation; Text analytics; Social networks; ISLRN; Bibliometrics; Scientometrics;
DOI
10.1007/s10579-016-9352-9
Chinese Library Classification
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
This paper analyzes the content of the proceedings of the Language Resources and Evaluation Conference (LREC) over the past 17 years (1998-2014), with the goal of gaining a picture of the LREC community and the topics that are most relevant to the field. We follow the methodology used in similar studies, including the survey of the IEEE ICASSP conference proceedings from 1976 to 1990, the survey of the Association for Computational Linguistics conference proceedings over 50 years, and the survey of the proceedings of the conferences contained in the ISCA Archive over 25 years (1987-2012). We expand on results originally presented at LREC 2014, but include the proceedings of LREC 2014 itself in the study together with an analysis of various citation graphs. We show the evolution over time of the number of papers and authors, including their distribution by gender and affiliation, as well as collaborations and citation patterns among authors and papers, funding sources for reported research, and plagiarism and reuse in LREC papers; results for LREC are compared with similar results for major conferences in related fields. We also consider the evolution of research topics over time and identify the authors who introduced key terms. Finally, we propose and apply a measure of a researcher's notability and provide the results for LREC authors. The study uses NLP methods that have been published in the corpus considered in the study. In addition to providing a revealing characterization of the LRE community, the study also demonstrates the need for establishing a system for unique identification of authors, papers and other sources to facilitate this type of analysis.
Pages: 165-220
Number of pages: 56