End-to-end Named Entity Recognition from English Speech

Cited by: 18
Authors
Yadav, Hemant [1]
Ghosh, Sreyan [1]
Yu, Yi [2]
Shah, Rajiv Ratn [1]
Affiliations
[1] IIIT Delhi, MIDAS, Delhi, India
[2] National Institute of Informatics, Tokyo, Japan
Source
INTERSPEECH 2020
Keywords
End-to-end ASR; named entity recognition; deep learning; out of vocabulary (OOV) words
DOI
10.21437/Interspeech.2020-2482
Chinese Library Classification (CLC) number
R36 [Pathology]; R76 [Otorhinolaryngology]
Subject classification codes
100104; 100213
Abstract
Named entity recognition (NER) from text has been a widely studied problem and usually extracts semantic information from text. Until now, NER from speech has mostly been studied as a two-step pipeline: an automatic speech recognition (ASR) system is first applied to an audio sample, and the predicted transcript is then passed to a NER tagger. In such cases, the error does not propagate from one step to another, as the two tasks are not optimized in an end-to-end (E2E) fashion. Recent studies confirm that integrated approaches (e.g., E2E ASR) outperform sequential ones (e.g., phoneme-based ASR). In this paper, we introduce a first publicly available NER-annotated dataset for English speech and present an E2E approach that jointly optimizes the ASR and NER tagger components. Experimental results show that the proposed E2E approach outperforms the classical two-step approach. We also discuss how NER from speech can be used to handle out-of-vocabulary (OOV) words in an ASR system.
Pages: 4268-4272
Page count: 5