Deepfake Speech Detection: A Spectrogram Analysis

Cited by: 4

Authors:

Firc, Anton [1]

Malinka, Kamil [1]

Hanacek, Petr [1]

Affiliations:

[1] Brno Univ Technol, Brno, Czech Republic

Source:

39TH ANNUAL ACM SYMPOSIUM ON APPLIED COMPUTING, SAC 2024 | 2024

Keywords:

Deepfake; Speech; Image-based; Deepfake Detection; Spectrogram;

DOI:

10.1145/3605098.3635911

Chinese Library Classification (CLC) number:

TP39 [Computer applications];

Discipline classification code:

081203 ; 0835 ;

Abstract:

Current voice biometric systems have no natural mechanisms to defend against deepfake spoofing attacks, so supporting them with a deepfake detection solution is necessary. One of the latest approaches to deepfake speech detection represents speech as a spectrogram and uses it as input to a deep neural network. This work analyzes the feasibility of different spectrograms for deepfake speech detection. We compare spectrogram types in terms of detection performance, hardware requirements, and speed. We show that the majority of the spectrograms are feasible for deepfake detection. However, there is no general, correct answer to which spectrogram is best: as we demonstrate, different spectrograms suit different needs.


Pages: 1312-1320

Number of pages: 9
