Voice Privacy Through Time-Scale and Pitch Modification

Cited by: 0

Authors:

Prajapati, Gauri P. [1]

Singh, Dipesh K. [1]

Patil, Hemant A. [1]

Affiliation:

[1] Dhirubhai Ambani Inst Informat & Commun Technol, Gandhinagar, Gujarat, India

Source:

PATTERN RECOGNITION AND MACHINE INTELLIGENCE, PREMI 2021 | 2024 / Vol. 13102

Keywords:

Voice privacy; speech perturbation; anonymization; SPEAKER;

DOI:

10.1007/978-3-031-12700-7_8

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

An attacker can fraudulently gain access in place of the genuine user if the user's speech data is left unprotected. It is therefore important to protect users' speech data, for which a voice privacy system can be employed. A voice privacy system is not designed around any particular kind of attack; instead, it is designed in a generalized way, making it a universal system. This study presents time-scale and pitch modification-based anonymization methods that modify the speaker-dependent speech parameters (i.e., F0) for better privacy preservation of speech data. The performance of the proposed voice privacy approach is compared with the signal processing-based baseline system of the INTERSPEECH 2020 Voice Privacy Challenge. Having used various perturbation methods, the authors conclude that speed perturbation with a factor of 0.8 is better for obtaining adequate speaker anonymization (38.5% Equal Error Rate (EER) and 91.3% De-Identification (DeID)) and acceptable speech intelligibility (4.86% Word Error Rate (WER)) for female speakers. Speed and pitch perturbation are observed to be two important candidates for anonymization, whereas tempo perturbation is not found to be as useful for speaker anonymization.
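The abstract does not say how the perturbations are implemented. As a minimal sketch, assuming that speed perturbation means plain resampling (which changes duration and F0 together, whereas tempo perturbation changes duration only), the effect of a 0.8 factor can be illustrated on a synthetic tone. The helpers `speed_perturb`, `make_tone` and `estimate_freq` are hypothetical names introduced for this illustration and are not taken from the paper.

```python
import math

def speed_perturb(samples, factor):
    """Resample by linear interpolation (assumed method, not the paper's).

    Played back at the original sample rate, the result runs `factor`
    times as fast, so both duration and pitch change together.
    """
    if factor <= 0:
        raise ValueError("factor must be positive")
    n_out = int(round(len(samples) / factor))
    last = len(samples) - 1
    out = []
    for i in range(n_out):
        pos = min(i * factor, last)      # fractional read position in the input
        lo = int(math.floor(pos))
        hi = min(lo + 1, last)
        frac = pos - lo
        out.append((1.0 - frac) * samples[lo] + frac * samples[hi])
    return out

def make_tone(freq_hz, sample_rate, duration_s):
    """Generate a sine tone standing in for a voiced sound with a fixed F0."""
    n = int(sample_rate * duration_s)
    return [math.sin(2 * math.pi * freq_hz * t / sample_rate) for t in range(n)]

def estimate_freq(samples, sample_rate):
    """Rough frequency estimate from positive-going zero crossings."""
    crossings = sum(1 for a, b in zip(samples, samples[1:]) if a < 0 <= b)
    return crossings * sample_rate / len(samples)

sample_rate = 16000
tone = make_tone(200.0, sample_rate, 1.0)   # 200 Hz stand-in for F0
slowed = speed_perturb(tone, 0.8)           # factor 0.8 as in the abstract

print(len(tone), len(slowed))
print(estimate_freq(tone, sample_rate), estimate_freq(slowed, sample_rate))
```

With a factor of 0.8 the signal becomes 1/0.8 = 1.25 times longer and the tone frequency drops to roughly 0.8 of its original value. That coupling of duration and F0 is what this sketch is meant to show. It says nothing about the EER, DeID or WER figures reported above.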


Pages: 72-80

Page count: 9

50 items in total

[1] Voice Privacy Using Time-Scale and Pitch Modification
Singh D.K.
Prajapati G.P.
Patil H.A.
SN Computer Science, 5 (2)
[2] Voice privacy using CycleGAN and time-scale modification
Prajapati, Gauri P.
Singh, Dipesh K.
Amin, Preet P.
Patil, Hemant A.
COMPUTER SPEECH AND LANGUAGE, 2022, 74
[3] SHAPE INVARIANT TIME-SCALE AND PITCH MODIFICATION OF SPEECH
QUATIERI, TF
MCAULAY, RJ
IEEE TRANSACTIONS ON SIGNAL PROCESSING, 1992, 40 (03) : 497 - 510
[4] NONPARAMETRIC TECHNIQUES FOR PITCH-SCALE AND TIME-SCALE MODIFICATION OF SPEECH
MOULINES, E
LAROCHE, J
SPEECH COMMUNICATION, 1995, 16 (02) : 175 - 205
[5] Time-scale and pitch modification for Chinese speech based on sinusoidal model
Zhou, J.Y.
Chai, P.Q.
Tongji Daxue Xuebao/Journal of Tongji University, 2001, 29 (03) : 312 - 316
[6] Source-filter models for time-scale pitch-scale modification of speech
Acero, A
PROCEEDINGS OF THE 1998 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-6, 1998, : 881 - 884
[7] Integrated delay and error concealment for Internet voice applications with time-scale modification
Liu, F
Kim, J
Kuo, CCJ
MULTIMEDIA SYSTEMS AND APPLICATIONS IV, 2001, 4518 : 32 - 42
[8] Adaptive playout scheduling using time-scale modification in packet voice communications
Liang, YJ
Färber, N
Girod, B
2001 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-VI, PROCEEDINGS: VOL I: SPEECH PROCESSING 1; VOL II: SPEECH PROCESSING 2 IND TECHNOL TRACK DESIGN & IMPLEMENTATION OF SIGNAL PROCESSING SYSTEMS NEURALNETWORKS FOR SIGNAL PROCESSING; VOL III: IMAGE & MULTIDIMENSIONAL SIGNAL PROCESSING MULTIMEDIA SIGNAL PROCESSING, 2001, : 1445 - 1448
[9] Enhanced shape-invariant pitch and time-scale modification for concatenative speech synthesis
Pollard, MP
Cheetham, BMG
Goodyear, CC
Edgington, MD
Lowry, A
ICSLP 96 - FOURTH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, PROCEEDINGS, VOLS 1-4, 1996, : 1433 - 1436
[10] Time-Scale and Pitch-Scale Modification by the Phase Vocoder without Occurring the Phase Unwrapping Problem
Yoneguchi, Ryoichi
Murakami, Takahiro
2017 22ND INTERNATIONAL CONFERENCE ON DIGITAL SIGNAL PROCESSING (DSP), 2017,
