Approaches to Improving Recognition of Underrepresented Named Entities in Hybrid ASR Systems

Cited by: 1
Authors
Mao, Tingzhi [1]
Khassanov, Yerbolat [2, 3]
Pham, Van Tung [2]
Xu, Haihua [2]
Huang, Hao [1]
Chng, Eng Siong [2]
Affiliations
[1] Xinjiang Univ, Sch Informat Sci & Engn, Urumqi, Peoples R China
[2] Nanyang Technol Univ, Sch Comp Sci & Engn, Singapore, Singapore
[3] Nazarbayev Univ, ISSAI, Nur-Sultan, Kazakhstan
Source
2021 12TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP) | 2021
Funding
National Key Research and Development Program of China;
Keywords
speech recognition; named entity recognition; graphemic lexicon; word lattice; word embeddings;
DOI
10.1109/ISCSLP49672.2021.9362062
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper, we present a series of complementary approaches to improve the recognition of underrepresented named entities (NEs) in hybrid ASR systems without compromising overall word error rate performance. Underrepresented words are those that are rare in the training data or absent from it (out-of-vocabulary, OOV), and therefore cannot be modeled reliably. We begin with a graphemic lexicon, which removes the need for phonetic models in hybrid ASR. We study it under different settings and demonstrate its effectiveness in dealing with underrepresented NEs. Next, we study the impact of a neural language model (LM) with letter-based features designed to handle infrequent words. We then enrich the representations of underrepresented NEs in a pretrained neural LM by borrowing the embedding representations of well-represented words, which yields a significant performance improvement on underrepresented NE recognition. Finally, we boost the likelihood scores of utterances containing NEs in the word lattices rescored by neural LMs and gain a further performance improvement. Combining these approaches improves NE recognition by up to 42% relative.
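Two of the ideas in the abstract can be illustrated with a minimal sketch. The abstract does not give the authors' exact procedures, so everything below is an assumption made for illustration: the function names `graphemic_lexicon` and `borrow_embedding`, the lowercasing and letter filtering, and the choice to interpolate a rare word's vector with the mean of donor vectors using a hypothetical `weight` parameter. It should not be read as the paper's implementation.

```python
from typing import Dict, List


def graphemic_lexicon(words: List[str]) -> Dict[str, List[str]]:
    """Map each word to its letter sequence, used in place of phonemes.

    Assumption: letters are lowercased and non-alphabetic characters dropped.
    """
    lexicon = {}
    for word in words:
        lexicon[word] = [ch for ch in word.lower() if ch.isalpha()]
    return lexicon


def borrow_embedding(
    embeddings: Dict[str, List[float]],
    rare_word: str,
    donor_words: List[str],
    weight: float = 0.5,
) -> List[float]:
    """Interpolate a rare word's vector with the mean of donor vectors.

    Assumption: plain averaging plus linear interpolation; `weight` is a
    hypothetical knob, not a value taken from the paper.
    """
    donors = [embeddings[w] for w in donor_words]
    dim = len(donors[0])
    # Element-wise mean of the well-represented (donor) word vectors.
    mean = [sum(vec[i] for vec in donors) / len(donors) for i in range(dim)]
    # An OOV word has no vector of its own, so it falls back to the donor mean.
    own = embeddings.get(rare_word, mean)
    return [(1 - weight) * o + weight * m for o, m in zip(own, mean)]


if __name__ == "__main__":
    print(graphemic_lexicon(["Urumqi"]))
    toy = {"singapore": [1.0, 0.0], "london": [0.0, 1.0], "urumqi": [0.0, 0.0]}
    print(borrow_embedding(toy, "urumqi", ["singapore", "london"]))
```

In this toy setup a rare city name borrows from frequent city names, on the assumption that words of the same NE class make reasonable donors; how donors are actually selected is not stated in the abstract.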
Pages: 5