SiFDetectCracker: An Adversarial Attack Against Fake Voice Detection Based on Speaker-Irrelative Features

Cited by: 1

Authors:

Hai, Xuan [1]

Liu, Xin [1]

Tan, Yuan [1]

Zhou, Qingguo [1]

Affiliations:

[1] Lanzhou Univ, Lanzhou, Peoples R China

Source:

PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023 | 2023

Keywords:

Adversarial Attack; Deepfake; AI-Synthesized Speech; Voice Detection;

DOI:

10.1145/3581783.3613841

Chinese Library Classification (CLC) number:

TP18 [Artificial Intelligence Theory];

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Voice is a vital medium for transmitting information. Advances in speech synthesis technology have produced high-quality synthesized voices that the human ear cannot distinguish from real ones. These fake voices have been widely used in natural Deepfake production and other malicious activities, raising serious security and privacy concerns. In response, many studies have worked on detecting fake voices and report excellent performance. However, is the story really over? In this paper, we propose SiFDetectCracker, a black-box adversarial attack framework against fake voice detection based on Speaker-Irrelative Features (SiFs). We select background noise and the silent segments before and after the speaker's voice as the primary attack features. Modifying these features in synthesized speech causes the fake speech detector to misjudge it. Experiments show that SiFDetectCracker achieves a success rate of more than 80% in bypassing existing state-of-the-art fake voice detection systems. We also conducted several experiments to evaluate the transferability and activation factor of our attack approach.


Pages: 8552-8560

Page count: 9
