Unsupervised Out-of-Distribution Dialect Detection with Mahalanobis Distance

Cited by: 0
Authors
Das, Sourya Dipta [1]
Vadi, Yash [1]
Unnam, Abhishek [1]
Yadav, Kuldeep [1]
Affiliations
[1] SHL Labs, Bangalore, Karnataka, India
Source
INTERSPEECH 2023 | 2023
Keywords
Out of Distribution Detection; Open Set Classification; Outlier Detection; Dialect Identification; Wav2vec 2.0; Automatic Speech Recognition
DOI
10.21437/Interspeech.2023-1974
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Dialect classification is used in a variety of applications, such as machine translation and speech recognition, to improve the overall performance of the system. In a real-world scenario, a deployed dialect classification model can encounter anomalous inputs that differ from the training data distribution, also called out-of-distribution (OOD) samples. Such OOD samples can lead to unexpected outputs, because their dialects were unseen during model training. Out-of-distribution detection is a new research area that has received little attention in the context of dialect classification. To address this, we propose a simple yet effective unsupervised Mahalanobis distance feature-based method to detect out-of-distribution samples. We utilize the latent embeddings from all intermediate layers of a wav2vec 2.0 transformer-based dialect classifier model for multi-task learning. Our proposed approach outperforms other state-of-the-art OOD detection methods significantly.
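The abstract gives no implementation details, so the following is only a minimal sketch of how layer-wise Mahalanobis distance features are commonly computed, not the authors' method. It assumes class-conditional Gaussians with a shared covariance per layer, uses synthetic vectors in place of wav2vec 2.0 embeddings, and sums the per-layer distances into one score. All function names, the ridge term, and the summation step are hypothetical choices made for illustration.

```python
import numpy as np

def fit_gaussians(feats, labels):
    """Fit class means and a shared (tied) precision matrix for one layer."""
    classes = np.unique(labels)
    means = np.stack([feats[labels == c].mean(axis=0) for c in classes])
    # Center every sample on the mean of its own class.
    centered = feats - means[np.searchsorted(classes, labels)]
    cov = centered.T @ centered / len(feats)
    # Small ridge term keeps the covariance invertible (an assumption).
    precision = np.linalg.inv(cov + 1e-6 * np.eye(feats.shape[1]))
    return means, precision

def min_mahalanobis(x, means, precision):
    """Squared Mahalanobis distance from each sample to its closest class mean."""
    diff = x[:, None, :] - means[None, :, :]  # (samples, classes, dim)
    d2 = np.einsum("ncd,de,nce->nc", diff, precision, diff)
    return d2.min(axis=1)

def ood_features(layer_feats, fitted):
    """One distance per layer, stacked into a (samples, layers) feature matrix."""
    cols = [min_mahalanobis(x, m, p) for x, (m, p) in zip(layer_feats, fitted)]
    return np.stack(cols, axis=1)

# Synthetic stand-in for per-layer embeddings (not real speech features).
rng = np.random.default_rng(0)
num_layers, dim, num_classes = 3, 4, 3
centers = rng.normal(scale=5.0, size=(num_layers, num_classes, dim))

train_labels = rng.integers(0, num_classes, size=300)
train_layers = [centers[l][train_labels] + rng.normal(size=(300, dim))
                for l in range(num_layers)]
fitted = [fit_gaussians(f, train_labels) for f in train_layers]

test_labels = rng.integers(0, num_classes, size=50)
in_layers = [centers[l][test_labels] + rng.normal(size=(50, dim))
             for l in range(num_layers)]
ood_layers = [rng.normal(loc=30.0, size=(50, dim)) for _ in range(num_layers)]

in_feats = ood_features(in_layers, fitted)
ood_feats = ood_features(ood_layers, fitted)

# Summing over layers is one simple way to get a scalar score (an assumption).
in_scores = in_feats.sum(axis=1)
ood_scores = ood_feats.sum(axis=1)
print("mean in-distribution score:", in_scores.mean())
print("mean OOD score:", ood_scores.mean())
```

A higher score means the sample lies farther from every known class in every layer. In practice the per-layer feature matrix could instead be passed to an unsupervised outlier detector rather than summed.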
Pages: 1978-1982
Number of pages: 5