Comparison of the effectiveness of cepstral coefficients for Russian speech synthesis detection

Cited by: 2

Authors:

Efanov, Dmitry [1]

Aleksandrov, Pavel [1]

Mironov, Ilia [1]

Affiliation:

[1] Natl Res Nucl Univ MEPhI, Moscow Engn Phys Inst, Moscow, Russia

Source:

JOURNAL OF COMPUTER VIROLOGY AND HACKING TECHNIQUES | 2024 / Vol. 20 / Issue 03

Keywords:

speech synthesis; voice spoofing attacks; text-to-speech; Automatic Speaker Verification (ASV); cepstral coefficient; convolutional neural network; LSTM; SPOOFING COUNTERMEASURE; SPEAKER VERIFICATION;

DOI:

10.1007/s11416-023-00491-0

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812 ;

Abstract:

Modern speech synthesis technologies can be used to deceive voice authentication systems, carry out phone scams, or discredit public figures. Detecting synthesized speech is therefore an urgent task in protecting against voice substitution attacks. The solution to this problem depends on the choice of cepstral coefficients that determine the quality of the cloned voice. In addition, the dataset used to train the neural network must match the language in which synthesized speech is to be detected. The paper discusses the most widely used cepstral coefficients and compares their effectiveness using two main types of neural networks. To train the networks, a Russian speech dataset was developed; its use makes it possible to achieve the highest accuracy in detecting synthesized speech in Russian-language deepfakes.


Pages: 375-382

Number of pages: 8
