End-to-end Named Entity Recognition from English Speech

Cited by: 18

Authors:

Yadav, Hemant [1]

Ghosh, Sreyan [1]

Yu, Yi [2]

Shah, Rajiv Ratn [1]

Affiliations:

[1] IIIT Delhi, MIDAS, Delhi, India

[2] Natl Inst Informat, Tokyo, Japan

Source:

INTERSPEECH 2020 | 2020

Keywords:

End-to-end ASR; named entity recognition; deep learning; out of vocabulary (OOV) words;

DOI:

10.21437/Interspeech.2020-2482

Chinese Library Classification:

R36 [Pathology]; R76 [Otorhinolaryngology];

Discipline classification codes:

100104 ; 100213 ;

Abstract:

Named entity recognition (NER) from text has been a widely studied problem and usually extracts semantic information from text. Until now, NER from speech is mostly studied in a two-step pipeline process that includes first applying an automatic speech recognition (ASR) system on an audio sample and then passing the predicted transcript to a NER tagger. In such cases, the error does not propagate from one step to another as both the tasks are not optimized in an end-to-end (E2E) fashion. Recent studies confirm that integrated approaches (e.g., E2E ASR) outperform sequential ones (e.g., phoneme based ASR). In this paper, we introduce a first publicly available NER annotated dataset for English speech and present an E2E approach, which jointly optimizes the ASR and NER tagger components. Experimental results show that the proposed E2E approach outperforms the classical two-step approach. We also discuss how NER from speech can be used to handle out of vocabulary (OOV) words in an ASR system.

Pages: 4268-4272

Number of pages: 5

