Voice Deepfake Detection Using the Self-Supervised Pre-Training Model HuBERT

Cited: 3
Authors
Li, Lanting [1 ]
Lu, Tianliang [1 ]
Ma, Xingbang [1 ]
Yuan, Mengjiao [1 ]
Wan, Da [1 ]
Affiliations
[1] Peoples Publ Secur Univ China, Coll Informat & Cyber Secur, Beijing 100038, Peoples R China
Source
APPLIED SCIENCES-BASEL, 2023, Vol. 13, Iss. 14
Keywords
voice deepfake detection; self-supervised learning; pre-training; feature map scaling; anti-spoofing;
DOI
10.3390/app13148488
CLC Number
O6 [Chemistry];
Discipline Code
0703;
Abstract
In recent years, voice deepfake technology has developed rapidly, but current detection methods generalize poorly and extract insufficient features from unknown attacks. This paper presents a forged-speech detection method (HuRawNet2_modified) based on the self-supervised pre-trained model HuBERT to address these problems. A combination of impulsive signal-dependent additive noise and additive white Gaussian noise was adopted for data augmentation, and the HuBERT model was fine-tuned on databases in different languages. On this basis, the extracted feature maps were scaled independently by the α-feature map scaling (α-FMS) method within a modified end-to-end architecture that uses the RawNet2 model as its backbone. The results showed that the HuBERT model extracted features more comprehensively and accurately. The best evaluation indicators were an equal error rate (EER) of 2.89% and a minimum tandem detection cost function (min t-DCF) of 0.2182 on the ASVspoof 2021 LA challenge database, verifying the effectiveness of the proposed method. Compared with the baseline systems on the ASVspoof 2021 LA and FMFCC-A databases, both the EER and the min t-DCF decreased. The results also showed that the fine-tuned self-supervised pre-trained model can extract acoustic features across languages, and that detection improves slightly when the pre-training, fine-tuning, and test databases are in the same language.
Pages: 15
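
To make the abstract's pipeline concrete, below is a minimal, hypothetical Python/PyTorch sketch of its three named ingredients: additive white Gaussian noise augmentation, HuBERT feature extraction, and α-feature map scaling (α-FMS) ahead of a classification head. It assumes the HuggingFace Transformers HubertModel API and the "facebook/hubert-base-ls960" checkpoint; the layer sizes, classification head, and placement of α-FMS are illustrative only and are not the authors' HuRawNet2_modified code, whose RawNet2 residual blocks and impulsive signal-dependent noise augmentation are omitted here.

    import torch
    import torch.nn as nn
    from transformers import HubertModel  # assumes HuggingFace Transformers is installed


    def add_awgn(wav: torch.Tensor, snr_db: float = 20.0) -> torch.Tensor:
        # Additive white Gaussian noise at a target SNR -- one of the two
        # augmentations named in the abstract (the impulsive signal-dependent
        # noise is omitted from this sketch).
        noise = torch.randn_like(wav)
        gain = torch.sqrt(wav.pow(2).mean() / (noise.pow(2).mean() * 10 ** (snr_db / 10)))
        return wav + gain * noise


    class AlphaFMS(nn.Module):
        # alpha-feature map scaling as described in the RawNet2 literature:
        # y = (x + alpha) * sigmoid(W avgpool(x)), with a trainable
        # per-filter offset alpha.
        def __init__(self, n_filters: int):
            super().__init__()
            self.alpha = nn.Parameter(torch.ones(1, n_filters, 1))
            self.fc = nn.Linear(n_filters, n_filters)

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            # x: (batch, filters, time)
            s = torch.sigmoid(self.fc(x.mean(dim=-1))).unsqueeze(-1)
            return (x + self.alpha) * s


    class HuBERTSpoofDetector(nn.Module):
        # Hypothetical detector; not the paper's architecture.
        def __init__(self, checkpoint: str = "facebook/hubert-base-ls960"):
            super().__init__()
            self.hubert = HubertModel.from_pretrained(checkpoint)  # fine-tuned jointly per the abstract
            self.fms = AlphaFMS(n_filters=768)  # 768 = HuBERT-base hidden size
            self.head = nn.Sequential(nn.AdaptiveAvgPool1d(1), nn.Flatten(),
                                      nn.Linear(768, 2))  # bona fide vs. spoof

        def forward(self, wav: torch.Tensor) -> torch.Tensor:
            # wav: (batch, samples) of raw 16 kHz audio
            h = self.hubert(wav).last_hidden_state   # (batch, frames, 768)
            h = self.fms(h.transpose(1, 2))          # scale each feature map
            return self.head(h)                      # (batch, 2) logits


    if __name__ == "__main__":
        model = HuBERTSpoofDetector()
        wav = add_awgn(torch.randn(1, 16000))        # 1 s of dummy audio
        print(model(wav).shape)                      # torch.Size([1, 2])

The per-filter sigmoid gate plays the same role as a squeeze-and-excitation block; α-FMS additionally learns the offset α so that scaling can shift feature maps rather than only attenuate them.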