REDISCOVERING 50 YEARS OF DISCOVERIES IN SPEECH AND LANGUAGE PROCESSING: A SURVEY.

Cited by: 0
Authors
Mariani, Joseph [1 ]
Francopoulo, Gil [2 ]
Paroubek, Patrick [1 ]
Vernier, Frederic [1 ]
Affiliations
[1] CNRS, LIMSI, Paris, France
[2] Tagmatica, Paris, France
Source
2017 20TH CONFERENCE OF THE ORIENTAL CHAPTER OF THE INTERNATIONAL COORDINATING COMMITTEE ON SPEECH DATABASES AND SPEECH I/O SYSTEMS AND ASSESSMENT (O-COCOSDA) | 2017
Keywords
Speech Processing; Natural Language Processing; Text Analytics; Bibliometrics; Scientometrics; Informetrics;
DOI
Not available
Chinese Library Classification number
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
We have created the NLP4NLP corpus to study the content of scientific publications in the field of speech and natural language processing. It contains articles published in 34 major conferences and journals in that field over a period of 50 years (1965-2015), comprising 65,000 documents, gathering 50,000 authors, including 325,000 references and representing approximately 270 million words. Most of these publications are in English; some are in French, German or Russian. Some are open access; others have been provided by the publishers. In order to constitute and analyze this corpus, several tools have been used or developed. Some of them use Natural Language Processing methods that have been published in the corpus, hence its name. Numerous manual corrections were necessary, which demonstrated the importance of establishing standards for uniquely identifying authors, publications or resources. We have conducted various studies: evolution over time of the number of articles and authors, collaborations between authors, citations between papers and authors, evolution of research themes and identification of the authors who introduced them, measure of innovation and detection of epistemological ruptures, use of language resources, reuse of articles and plagiarism in the context of a global or comparative analysis between sources.
Pages: 23
Related papers
50 in total
  • [31] Speech and language processing for next-millennium communications services
    Cox, RV
    Kamm, CA
    Rabiner, LR
    Schroeter, J
    Wilpon, JG
    PROCEEDINGS OF THE IEEE, 2000, 88 (08) : 1314 - 1337
  • [32] Speech Processing Application Based on Phonetics and Phonology of the Polish Language
    Klosowski, Piotr
    COMPUTER NETWORKS, 2010, 79 : 236 - 244
  • [33] Speech and language patterns in autism: Towards natural language processing as a research and clinical tool
    Trayvick, Jadyn
    Barkley, Sarah B.
    McGowan, Alessia
    Srivastava, Agrima
    Peters, Arabella W.
    Cecchi, Guillermo A.
    Foss-Feig, Jennifer H.
    Corcoran, Cheryl M.
    PSYCHIATRY RESEARCH, 2024, 340
  • [34] A Survey on Natural Language Processing for Fake News Detection
    Oshikawa, Ray
    Qian, Jing
    Wang, William Yang
    PROCEEDINGS OF THE 12TH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION (LREC 2020), 2020, : 6086 - 6093
  • [35] A Natural Language Processing Survey on Legislative and Greek Documents
    Krasadakis, Panteleimon
    Sakkopoulos, Evangelos
    Verykios, Vassilios S.
    25TH PAN-HELLENIC CONFERENCE ON INFORMATICS WITH INTERNATIONAL PARTICIPATION (PCI2021), 2021, : 407 - 412
  • [36] A Survey of the Usages of Deep Learning for Natural Language Processing
    Otter, Daniel W.
    Medina, Julian R.
    Kalita, Jugal K.
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2021, 32 (02) : 604 - 624
  • [37] A Survey on Backdoor Attack and Defense in Natural Language Processing
    Sheng, Xuan
    Han, Zhaoyang
    Li, Piji
    Chang, Xiangmao
    2022 IEEE 22ND INTERNATIONAL CONFERENCE ON SOFTWARE QUALITY, RELIABILITY AND SECURITY, QRS, 2022, : 809 - 820
  • [38] Data augmentation approaches in natural language processing: A survey
    Li, Bohan
    Hou, Yutai
    Che, Wanxiang
    AI OPEN, 2022, 3 : 71 - 90
  • [39] Local Interpretations for Explainable Natural Language Processing: A Survey
    Luo, Siwen
    Ivison, Hamish
    Han, Soyeon Caren
    Poon, Josiah
    ACM COMPUTING SURVEYS, 2024, 56 (09)
  • [40] SECNLP: A survey of embeddings in clinical natural language processing
    Kalyan, Katikapalli Subramanyam
    Sangeetha, S.
    JOURNAL OF BIOMEDICAL INFORMATICS, 2020, 101 (101)