Comparison of the effectiveness of cepstral coefficients for Russian speech synthesis detection

Cited by: 2
Authors
Efanov, Dmitry [1]
Aleksandrov, Pavel [1]
Mironov, Ilia [1]
Affiliations
[1] National Research Nuclear University MEPhI (Moscow Engineering Physics Institute), Moscow, Russia
Keywords
speech synthesis; voice spoofing attacks; text-to-speech; Automatic Speaker Verification (ASV); cepstral coefficient; convolutional neural network; LSTM; spoofing countermeasure; speaker verification
DOI
10.1007/s11416-023-00491-0
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
Modern speech synthesis technologies can be used to deceive voice authentication systems, carry out phone scams, or discredit public figures. Detecting synthesized speech is therefore an urgent task in protecting against voice substitution attacks. The solution to this problem depends on the choice of cepstral coefficients, which determine the quality of the cloned voice. In addition, the dataset used to train the neural network must match the language in which synthesized speech is to be detected. The paper discusses the most widely used cepstral coefficients and compares their effectiveness using two main types of neural networks. To train the networks, a Russian speech dataset was developed; its use yields the highest accuracy in detecting synthesized speech in Russian-language deepfakes.
Pages: 375-382
Number of pages: 8