SiFDetectCracker: An Adversarial Attack Against Fake Voice Detection Based on Speaker-Irrelative Features

Cited by: 1
Authors
Hai, Xuan [1 ]
Liu, Xin [1 ]
Tan, Yuan [1 ]
Zhou, Qingguo [1 ]
Affiliations
[1] Lanzhou Univ, Lanzhou, Peoples R China
Source
PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023 | 2023
Keywords
Adversarial Attack; Deepfake; AI-Synthesized Speech; Voice Detection
DOI
10.1145/3581783.3613841
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Voice is a vital medium for transmitting information. Advances in speech synthesis technology have produced high-quality synthesized voices that human ears cannot distinguish from real speech. These fake voices have been widely used in natural Deepfake production and other malicious activities, raising serious security and privacy concerns. In response, many studies have worked on detecting fake voices and have reported excellent performance. However, is the story really over? In this paper, we propose SiFDetectCracker, a black-box adversarial attack framework against fake voice detection based on Speaker-Irrelative Features (SiFs). We select the background noise and the silent segments before and after the speaker's voice as the primary attack features. Modifying these features in synthesized speech causes the fake speech detector to misjudge it. Experiments show that SiFDetectCracker achieves a success rate of more than 80% in bypassing existing state-of-the-art fake voice detection systems. We also conducted several experiments to evaluate the transferability and the activation factor of our attack approach.
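To make the idea of speaker-irrelative features concrete, the following is a minimal sketch, not the authors' implementation. It assumes a simple energy threshold is enough to find the silent segments before and after the speech, and it perturbs only those segments with fixed Gaussian noise. The function names, the frame length, the threshold, and the noise level are all hypothetical choices made for illustration. The black-box search that SiFDetectCracker would need in order to tune the perturbation against a detector is not shown.

```python
import numpy as np

def find_voiced_span(signal, frame_len=400, threshold=0.01):
    """Return (start, end) sample indices of the voiced region.

    Assumption: a frame counts as voiced when its RMS energy exceeds
    `threshold`. This is an illustrative heuristic, not the paper's method.
    """
    n_frames = len(signal) // frame_len
    frames = signal[: n_frames * frame_len].reshape(n_frames, frame_len)
    rms = np.sqrt(np.mean(frames ** 2, axis=1))
    voiced = np.flatnonzero(rms > threshold)
    if voiced.size == 0:
        return 0, 0  # no voiced frame found: the whole clip is treated as silence
    return int(voiced[0]) * frame_len, (int(voiced[-1]) + 1) * frame_len

def perturb_speaker_irrelative_parts(signal, noise_std=0.002, seed=0,
                                     frame_len=400, threshold=0.01):
    """Add Gaussian noise only to the leading and trailing silent segments.

    The voiced span is left untouched, so the speaker's content is unchanged.
    """
    rng = np.random.default_rng(seed)
    start, end = find_voiced_span(signal, frame_len, threshold)
    out = np.asarray(signal, dtype=np.float64).copy()
    out[:start] += rng.normal(0.0, noise_std, start)
    out[end:] += rng.normal(0.0, noise_std, len(out) - end)
    return out, (start, end)

# Toy clip: 0.5 s silence, 1 s of a 220 Hz tone standing in for speech, 0.5 s silence.
sr = 16000
t = np.arange(sr) / sr
clip = np.zeros(2 * sr)
clip[sr // 2 : sr // 2 + sr] = 0.5 * np.sin(2 * np.pi * 220 * t)

adversarial, (start, end) = perturb_speaker_irrelative_parts(clip)
```

In this toy setup only the samples outside `start:end` differ from the input. A real attack in the spirit of the abstract would presumably query the target detector and adjust the noise and silence segments based on its responses, rather than apply one fixed perturbation.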
Pages: 8552-8560
Page count: 9