ENHANCED BERT-BASED RANKING MODELS FOR SPOKEN DOCUMENT RETRIEVAL

Times cited: 0
Authors
Lin, Hsiao-Yun [1]
Lo, Tien-Hong [1]
Chen, Berlin [1, 2]
Affiliations
[1] Natl Taiwan Normal Univ, Taipei, Taiwan
[2] Pervas Artificial Intelligence Res PAIR Labs, Taipei, Taiwan
Source
2019 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU 2019) | 2019
Keywords
Spoken document retrieval; information retrieval; speech recognition; model augmentation; BERT
DOI
10.1109/asru46091.2019.9003890
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
The Bidirectional Encoder Representations from Transformers (BERT) model has recently achieved record-breaking success on many natural language processing (NLP) tasks such as question answering and language understanding. However, relatively little work has been done on applying it to ad-hoc information retrieval (IR), especially spoken document retrieval (SDR). This paper adopts and extends BERT for SDR, and its contributions are at least threefold. First, we augment BERT with extra language features such as unigram and inverse document frequency (IDF) statistics to make it more applicable to SDR. Second, we explore incorporating confidence scores into document representations to see whether they can help alleviate the negative effects of imperfect automatic speech recognition (ASR). Third, we conduct a comprehensive set of experiments to compare our BERT-based ranking methods with other state-of-the-art methods and to investigate the synergy among them.
Pages: 601-606
Number of pages: 6