Automatic Speaker Verification on Myanmar Spoofing Voice Data using GMM-UBM and TDNN

Times cited: 0
Authors
Phyu, Win Lai Lai [1]
Pa, Win Pa [1]
Naing, Hay Mar Soe [1]
Affiliation
[1] Univ Comp Studies Yangon, Nat Language & Speech Proc Lab, Yangon, Myanmar
Source
PROCEEDINGS OF THE 6TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA IN ASIA WORKSHOPS, MMASIA 2024 WORKSHOPS | 2024
Keywords
speaker verification; robustness; voice conversion; MFCC; GMM-UBM; TDNN; FreeVC; GMMVC; DIFFGMM
DOI
10.1145/3700410.3702121
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Artificial voices and human voice imitation pose a risk to speech verification security systems. Because voice conversion can be regarded as a variant of voice imitation, this study tests an automatic speaker verification system on converted voices to assess how susceptible such systems are to this type of deception. Voice conversions among three female speakers are used in the tests. System performance is assessed with a Gaussian Mixture Model-Universal Background Model (GMM-UBM) and a Time Delay Neural Network (TDNN) model, both using Mel Frequency Cepstral Coefficient (MFCC) acoustic features. The objective is to evaluate the resilience of an automatic speaker verification system when subjected to three different voice conversion methods: FreeVC, GMMVC, and Differential GMM (DIFFGMM). The findings indicate that converted voices yield a high error rate, whereas real voices yield a lower error rate.
Pages: 7